CIICT 2009

Proceedings of the China-Ireland Information and Communications Technologies Conference

National University of Ireland Maynooth
19th – 21st August 2009

Edited by Adam C. Winstanley


Adam C. Winstanley (ed.)
CIICT 2009
China-Ireland Information and Communications Technologies Conference
19th – 21st August 2009

Department of Computer Science
National University of Ireland Maynooth
County Kildare, Ireland

ISBN 978 0 901519 67 2

© 2009 the individual authors of each paper

Responsibility for the contents rests entirely with the authors. The organising committee accepts no responsibility for any errors, omissions or views expressed in this publication.

All rights reserved. The copyright of each of the papers published in these proceedings remains with the author(s). No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means without the prior permission of the author. However, permission is not required to copy pages for the purposes of research or private study on condition that full reference to the source is given.

Published August 2009, NUI Maynooth, Ireland

Additional copies may be ordered from: CIICT 2009, Department of Computer Science, NUI Maynooth, County Kildare, Ireland. ciict@cs.nuim.ie

Cover design by Justyna Ciepluch.
Printed by Cahill Printers Limited, Clonshaugh, Dublin 17.


Foreword

On behalf of the organising committee I would like to extend my warmest welcome to all speakers and attendees at the 2009 China-Ireland Information and Communications Technologies conference (CIICT), hosted by the Department of Computer Science, NUI Maynooth, in association with several other departments and research institutes on campus, including the Department of Electronic Engineering, the National Centre for Geocomputation (NCG) and the Institute of Microelectronics and Wireless Systems (IMWS). The NCG and IMWS were founded as the result of grants from Science Foundation Ireland, recognising the importance of these areas in modern science and engineering. Many thanks are also due to Science Foundation Ireland for awarding a generous grant to support a keynote speaker and enabling us to offer 20 bursaries to assist student participation at CIICT 2009.

In 2009, the China-Ireland Information and Communications Technologies conference again brings together theoreticians, practitioners and academics from the numerous related disciplines that are painted within the broad brush of ICT. This is the fourth conference in the series, following events held at Hangzhou Dianzi University (2006), Dublin City University (2007) and Beijing University of Posts and Telecommunications (2008).

The initial call for papers for CIICT 2009 was issued in March 2009. Contributions were sought from researchers in China and Ireland (and also worldwide) on all aspects of Information and Communications Technology. The conference organisers received 65 papers from both China and Ireland, with several having authors from both countries. Each paper was subject to review by at least three members of the programme committee. 31 papers were accepted for oral presentation and publication as full papers in the proceedings. In April 2009, a further call was made for short papers describing work in progress and preliminary results. These were each reviewed by two programme committee members, and this resulted in a further 22 papers for short presentation at the conference.

I wish to thank the members of the programme committee, without whose comments and prompt reviews a conference of this type could not be successful. I am also grateful to the members of the local organising committee, the Maynooth Conference and Accommodation Centre and the support staff of NUI Maynooth and St. Patrick's College, Maynooth.

CIICT has prospered due to the generosity of sponsors who every year support the conference and the many associated prizes, competitions, workshops and social events. This year is no exception and we wish to thank our current sponsors for their generosity: Science Foundation Ireland, Fáilte Ireland, Microsoft Ireland and the Maynooth Campus and Conference Centre.

So welcome, CIICT delegates, to Maynooth. We hope the conference is beneficial and that you leave having had a useful and enjoyable experience.

Adam Winstanley
Chair, CIICT 2009


Committees

Conference Chair
Adam Winstanley, CS NUIM

Conference Co-Chairs
Ronan Farrell, EE NUIM
Lei Yan, Peking University

Proceedings Sub-editor
Jianghua Zheng

Local Organising Committee
Yanpeng Cao
Błażej Ciepłuch
Padraig Corcoran
Ricky Jacob
Laura Keyes
Paul Lewis
Eoin Mac Aoidh
Peter Mooney
Zheng Pan
Bashir Shalaik
Jianghua Zheng

CIICT Steering Committee
Charles McCorkell (Chair), Dublin City University
Thomas J. Brazil, University College Dublin
Gabriel M. Crean, University College Cork
Stephen Holland, Synopsys
Chris Horn, Iona Technologies PLC
Fiona O'Brien, Lenovo International BV
Donal O'Mahony, Trinity College Dublin
Lingling Sun, Hangzhou Dianzi University
Roger W. Whatmore, Tyndall National Institute
Yinghai Zhang, Beijing University of Posts and Telecommunications

CIICT 2009 Programme Committee
Rob Brennan, Trinity College Dublin
Donggang Cao, Peking University
Gang Chen, Wuhan University
Jiming Chen, Zhejiang University
Padraig Corcoran, NUI Maynooth
Ronan Farrell, NUI Maynooth
Amy Fitzgerald, NUI Maynooth
Ivan Ganchev, University of Limerick
David Gray, Dublin City University
Boran Guan, Hangzhou Dianzi University
Ron Healy, NUI Maynooth
Matthieu Hodgkinson, NUI Maynooth
Peter Hung, NUI Maynooth
Hai Jin, Huazhong University of Science & Technology
Yan Ma, Beijing University of Posts and Telecommunications
Jonathan Maycock, NUI Maynooth & Bielefeld University
Conor McArdle, Dublin City University
John McDonald, NUI Maynooth
Seamus McLoone, National University of Ireland, Maynooth
Derek Molloy, Dublin City University
Rosemary Monahan, NUI Maynooth
Aidan Mooney, NUI Maynooth
Peter Mooney, NUI Maynooth
Diarmuid O'Donoghue, NUI Maynooth
Aidan O'Dwyer, Dublin Institute of Technology
Derek C. W. Pao, City University of Hong Kong
Philip Perry, QoSTech Ltd, Dublin
James Power, NUI Maynooth
Robert Sadleir, Dublin City University
Ronan Scaife, Dublin City University
Bashir Shalaik, NUI Maynooth
Xiaojun Wang, Dublin City University
Chuangbai Xiao, Beijing University of Technology
Hongwen Yang, Beijing University of Posts and Telecommunications
Yonglin Yu, Wuhan National Laboratory for Optoelectronics
Jianghua Zheng, NUI Maynooth
Sheng Zheng, China Three Gorges University
Anding Zhu, University College Dublin

For further information about this and future CIICT conferences see:
www.ciict.org
ciict.cs.nuim.ie


TABLE OF CONTENTS

Section 1A: ANTENNAS AND CIRCUITS 1 (page 1)

Demonstrator Platform for Antenna Array Calibration (page 2)
Gerald P. Corley, Justine M. McCormack, Ronan J. Farrell

A Numerical Model to Estimate PIFA (Planar Inverted-F Antenna) Performance with Rotation Effect in Proximity to the Human Body (page 7)
Zhiyuan Duan

Design of Compact Annular-Ring Patch Antennas for Circular Polarization (page 11)
Xiulong Bao, Max Ammann

Section 1B: LOCATION-BASED SYSTEMS (page 16)

Interpretation of Spatial Movement and Perception in Location Based Services (page 17)
Mac Aoidh, Eoin; Winstanley, Adam

Location Based Services of University Town Based on OpenStreetMap: NUI Maynooth as an example (page 19)
Zheng, Jianghua; Ciepłuch, Błażej; Mooney, Peter; Winstanley, Adam

Wiimote as a Navigation tool for Pedestrians (page 23)
Jacob, Ricky; Winstanley, Adam; Mac Aoidh, Eoin; Meenagh, Declan

Feedback Control Models and Their Application in Pedestrian Navigation Systems (page 25)
Yan, Lei; Pan, Zheng; Winstanley, Adam C.; Fotheringham, A. Stewart; Zheng, Jianghua

Tram and Bus Tracker: A Dynamic Web Application for Public Transit (page 29)
Shalaik, Bashir; Winstanley, Adam

Section 1C: SIGNAL PROCESSING (page 33)

Digital Audio Watermarking by Magnitude Modification of Frequency Components Using the CSPE Algorithm (page 34)
Wang, Jian; Healy, Ron; Timoney, Joseph

Using Convolutive Non-Negative Matrix Factorization Algorithm To Perform Acoustic Echo Cancellation (page 41)
Zhou, Xin; Liang, Ye; Cahill, Niall; Lawlor, Robert

Using Apodization to improve the performance of the Complex Spectral Phase Estimation (CSPE) Algorithm (page 47)
Wang, Jian; Timoney, Joe; Hodgkinson, Matthew

Computing Modified Bessel functions with large Modulation Index for Sound Synthesis Applications (page 52)
Timoney, Joseph; Lysaght, Thomas; Lazzarini, Victor; Gao, Ruiyao


Section 2A: RADIO SYSTEMS 1 (page 56)

Wireless Billboard Channels established over T-DMB (page 57)
Ji, Zhanlin; Ganchev, Ivan; O'Droma, Máirtín

RF SDR for Wideband PMR (page 61)
Gao, Ling; Farrell, Ronan J.

Q-Learning in Cognitive Radios (page 67)
Hosey, Neil James

Section 2B: GEOCOMPUTATION (page 74)

Extracting Localised Mobile Activity Patterns from Cumulative Mobile Spectrum RSSI (page 75)
Doyle, John; Farrell, Ronan; McLoone, Sean; McCarthy, Tim; Hung, Peter

Evaluating Twitter for use in Climate Change awareness campaigns (page 83)
Mooney, Peter; Winstanley, Adam; Corcoran, Padraig

Research on Unmanned Airship Low-altitude Photogrammetric Flight Planning and Pseudo-ortho Problems (page 87)
Duan, Yini; Zheng, Wen-hua; Yan, Lei

Spatial-temporal Simulation and Prediction of Sandy Desertification Evolution in Typical Area of Xinjiang (page 90)
Liu, Dunli; Zheng, Jianghua; Wang, Fei; Liu, Zhihui

Section 2C: COMPUTER NETWORKS 1 (page 97)

Effect of Hard RTOS on DPDC SCADA System Performance (page 98)
Azad, AKM Abdul Malek; Hussain, C. M.; Alam, Marzia

Hybrid Decoding Schemes for Turbo-Codes (page 106)
Huang, Shujun; Zhan, Yinwei; Abhayaratne, Charith

JavaScript code Fragment Analyzing for Cache Proxy (page 110)
Zhou, Yachao; Wang, Xiaofei; Tang, Yi; Yang, Jieyan; Wang, Xiaojun

Section 3A: CHANNELS AND PROPAGATION (page 114)

Research on the Gain flatness of Fiber-Optic Parametric Amplifier with Periodic Dispersion Compensation (page 115)
Jing, Jin; Li, Qiliang

ISIS – Urban Radio plan and time-variant characteristics of mobile vehicular network (page 119)
Jeyakumar, Serenus Dayal; Linton, David

Investigation of Dispersive Fading in UWB Over Fiber Systems (page 123)
Castillo Leon, Antonio; Perry, Philip; Anandarajah, Prince; Barry, Liam

An Efficient Consolidated Authentication Scheme for the Handover Process in Heterogeneous Networks (page 129)
Song, Mei; Wang, Xiaojun


Section 3B: COMPUTER VISION (page 133)

A new algorithm of edge detection based on soft morphology (page 134)
Shang, Junna; Jiang, Feng

3D tooth reconstruction with multiple B-Spline surfaces through linear least-squares fitting (page 138)
Zhao, Nailiang; Ma, Weiyin

Automatic Recognition of Head Movement Gestures in Sign Language Sentences (page 142)
Kelly, Daniel; Reilly Dellanoy, Jane; Mc Donald, John; Markham, Charles

Dirt and Sparkle Detection for Film Sequences (page 146)
Gaughran, Peter; Bergin, Susan; Reilly, Ronan

Section 3C: GEOSENSORS (page 149)

Lightweight Signal Processing Algorithms for Human Activity Monitoring using Dual PIR sensor Nodes (page 150)
Tahir, Muhammad; Hung, Peter; Farrell, Ronan; McLoone, Seán; McCarthy, Tim

Wavelength modulated off-axis integrated cavity system for trace H2S measurements (page 157)
Gu, Haitao; Yu, Dahai; Li, Xia; Gao, Xiumin; Huang, Wei; Wang, Jian

Integrated air quality monitoring: applications of geosensor networks (page 162)
Hayes, Jer Patrick; Lau, King-Tong; McCarthy, Robert J.; Diamond, Dermot

Approximate Analysis of Fibre Delay Lines and Wavelength Converters in an Optical Burst Switch (page 166)
Tafani, Daniele; McArdle, Conor

Section 4A: ANTENNAS AND CIRCUITS 2 (page 170)

Design of Integrated Stacked Spiral Thin-Film Transformer Based on Silicon Substrate (page 171)
Zheng, Liang; Qin, Huibin; Daniels, Stephen

Equivalent Circuit Modeling for On-Wafer Interconnects on SiO2-Si Substrate (page 175)
Liu, Jun; McCorkell, Charles; Lou, Liheng; Sun, Lingling; Wen, Jincai

Section 4B: eLEARNING (page 180)

Assessing Power Saving Techniques and Their Impact on E-learning Users (page 181)
Moldovan, Arghir Nicolae; Muntean, Cristina Hava

Billing Issues when Accessing Personalized Educational Content (page 188)
Molnar, Andreea Maria; Hava Muntean, Cristina


Section 5A: RADIO SYSTEMS 2 (page 196)

A Blind Detection Method of Non-Periodic DSSS Signals at Lower SNR (page 197)
Pu, Junjie; Zhao, Zhijin

Power Consumption Analysis of Bluetooth in Sniff Mode* (page 201)
Wen, Jiangchuan; Nelson, John

Necessity for an Intelligent Bandwidth Estimation Technique over Wireless Networks (page 206)
Yuan, Zhenhui; Venkataraman, Hrishikesh; Muntean, Gabriel-Miro

Section 5B: COMPUTER NETWORKS 2 (page 212)

Quality of Service in IMS based Content networking (page 213)
Li, Dalton; Lv, Jonathan

The Enhanced Dv-Hop Algorithm in Ad Hoc Network (page 219)
Pin, Zhang; Zhifu, Xu

Program Dependence Graph Generation and Its Use in Network Application Analysis (page 222)
Huang, Jing; Wang, Xiaojun

Section 6A: DIGITAL HOLOGRAPHY (page 226)

Segmentation and 3-D visualization of digital in-line holographic microscopy data (page 227)
Molony, Karen M.; Naughton, Thomas J.

Speed up of Fresnel transforms for Digital holography using pre-computed chirp and GPU processing (page 234)
Pandey, Nitesh; Hennelly, Bryan; Kelly, Damien; Naughton, Thomas

Twin removal in digital holography by means of speckle reduction (page 237)
Monaghan, David Samuel; Kelly, Damien; Pandey, Nitesh; Hennelly, Bryan

Review of Twin Reduction and Twin Removal Techniques in Holography (page 241)
Hennelly, Bryan M.; Kelly, Damien P.; Pandey, Nitesh; Monaghan, David

Section 6B: INTELLIGENT SYSTEMS (page 246)

Generating Initial Codebook of Vector Quantization Based on the Maximum Repulsion Distance (page 247)
Chen, Gang; Zhong, Sheng

Investigating the Influence of Population and Generation Size on GeneRepair Templates (page 252)
Amy Fitzgerald, Diarmuid P. O'Donoghue

Intelligent Learning Systems Where Are They Now (page 256)
George Mitchell, Colm Howlin

An Improved Haptic Algorithm for Virtual Bone Surgery (page 260)
Denis Gerard Stack, Joseph Connell

Corpus Design Techniques for Irish Speech Synthesis (page 264)
Amelia Kelly, Ailbhe Ní Chasaide, Harald Berthelsen, Nick Campbell, Christer Gobl


Section 1A

ANTENNAS AND CIRCUITS 1


DEMONSTRATOR PLATFORM FOR ANTENNA ARRAY CALIBRATION

Gerry Corley, Justine McCormack, Ronan Farrell
Centre for Telecommunications Value-Chain Research (CTVR),
Institute of Microelectronics and Wireless Systems,
National University of Ireland Maynooth, Ireland.

Abstract - This paper presents a hardware platform for antenna array calibration research in tower-top electronics. The platform has eight phase- and amplitude-controlled transmit channels and a novel antenna coupler array structure which provides non-radiative calibration capability. The phase and amplitude of each channel can be varied between 0 and 360° and over 25dB respectively, under full software control. The platform has been used to test and develop array calibration routines which achieve amplitude variances of less than 1dB and phase variances of less than 5°, measured between eight channels at the antenna connections.

I. INTRODUCTION

In order to achieve accurate beamforming it is essential that the elements of an array are amplitude and phase matched, or that the differences are known; in addition, these relationships must be maintained in demanding environmental conditions, such as a tower top, over long periods of time. Traditionally this has been achieved through the use of tight tolerance components, phase matched cables and factory measured calibration tables; however, this is an expensive approach and offers little adaptation to the ambient environmental conditions.

Amplitude and phase errors between array elements distort the antenna radiation pattern in terms of beam pointing direction, sidelobe level, half power beamwidth and null depth [1]. The extent of the distortion has been well covered in the antenna array literature [2], [3], [4].

There are several approaches to array calibration, including tight tolerance design with factory determined calibration tables, calibration using internal and external radiating sources, and non-radiative dynamic calibration [5]. The third approach was chosen as it does not require external radiators, high tolerance cables and components, or extensive array modelling. In addition, dynamic calibration allows continuous monitoring of the array status for network management purposes; this is a critical requirement for all cellular and wireless network operators.

Desirable features of any research platform are: that it be simple to use, that the controlling software be easy to modify, and that the hardware be easily expandable, for example through the addition of more array elements. In hardware, this was achieved through the use of off-the-shelf phase and amplitude adjustment components configured in a modular, easily expandable fashion. As regards software, it was decided to use Labview to control the system, implement the calibration algorithms and collect measurements. Labview is a graphical programming language aimed at test and control applications; it makes it easy to implement graphical user interfaces and it can call both C and Matlab functions [6].

II. PLATFORM ARCHITECTURE

Effective non-radiative array calibration relies on the ability to measure the transmit or receive signals as close to their antennas as possible and to compare the measured signals with reference signals to ascertain the phase and amplitude relationship between the elements. The reference signal(s) can be the actual transmit signal in the case of live calibration, or a pilot signal in the case of off-line calibration. A block diagram of a distributed transceiver array with integrated calibration/reference blocks is shown in Figure 1. This system consists of interconnected transceivers and calibration blocks, where the calibration blocks provide at least one and in most cases multiple calibration paths to each transceiver; in addition, every transceiver calibration path is linked to every other path through the tessellated transceiver and calibration block structure. This multiplicity and interdependence of calibration paths for each transceiver facilitates the development of powerful calibration algorithms [7].

Figure 1: Distributed Transceiver System, with built-in Calibration Infrastructure
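As an illustration of this tessellated structure, the sketch below (ours, not part of the paper; the grid indexing and function name are assumptions) shows how each calibration block couples the four transceivers around it, so that adjacent transceivers share more than one measurement path:

    # Illustrative model: transceivers sit on an R x C grid and each
    # calibration block couples the four transceivers at the corners of
    # one grid cell, giving neighbouring transceivers at least two
    # independent calibration paths.
    def coupled_transceivers(row, col, cols):
        """Transceiver indices coupled by the calibration block of grid
        cell (row, col), in an array with `cols` transceivers per row."""
        nw = row * cols + col
        return [nw, nw + 1, nw + cols, nw + cols + 1]

    # For a 2x4 transceiver array there is one row of three calibration
    # cells; transceivers 1 and 5 appear in both cell (0,0) and cell (0,1),
    # so their relationship can be measured along two distinct paths.
    cells = [coupled_transceivers(0, c, 4) for c in range(3)]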


A block diagram of the platform is depicted in Figure 2. The heart of the demonstrator is a 2x4 antenna coupler array; this novel design consists of an array of couplers where each coupler output provides coupling from the four surrounding transceiver through paths. A more detailed description of this coupler array and its operation is provided in [8] and [10]. In the case of transmission, the coupler outputs provide attenuated versions of the forward TX signals present at the through path inputs. The loss and phase shift between the through path input and output is small, less than 1dB, and the phase variation across all through paths is less than 2°. The coupler array variations can be included as a calibration offset table in the calibration algorithms.

Figure 2: Block diagram of calibration and beamforming platform

The coupler outputs are connected through an RF switch to the vector voltmeter module, which compares the coupled signal with a reference signal and produces DC outputs corresponding to the phase and amplitude difference between its inputs. The detector module is comprised of two Analog Devices AD8302 RF Gain and Phase Detector ICs configured to give I and Q outputs (cos φ and sin φ), from which it is possible to generate a linear monotonic output for all angles between 0 and 360°.

I = A·cos φ
Q = A·sin φ

φ = tan⁻¹(Q / I)                      (1)
φ′ = φ           for I > 0, Q > 0     (1.1)
φ′ = φ + 180°    for I < 0, Q > 0     (1.2)
φ′ = φ + 180°    for I < 0, Q < 0     (1.3)
φ′ = φ + 360°    for I > 0, Q < 0     (1.4)

The phase shift is described by equation (1); however, this produces discontinuities at 90° and 270°, so the response is modified according to equations (1.1)-(1.4). As regards relative amplitude, the AD8302 produces a linear output voltage for amplitude differences from -60dBm to 0dBm.

The phase and amplitude adjustment modules, one for each array element, allow the phase and amplitude of the RF signal to be varied from 0 to 398° and from -7 to -32dB with respect to (wrt) the input, by applying DC voltages to the phase and amplitude control inputs. The modules comprise continuously variable voltage controlled phase shifters (Mini-Circuits JSPHS-2484) and a variable attenuator (Mini-Circuits RVA-3000). These components are mounted on a printed circuit board with some control voltage level adjustment circuitry, and the ensemble is placed in a shielding can to minimise electromagnetic interference between array channels. Figure 3 shows a photograph of the phase and amplitude adjustment modules. An external amplifier was added in series with the phase and amplitude adjustment modules to compensate for their insertion loss and to ensure that the signal fed back to the phase and amplitude detector module would be at the input mid-range point.

Figure 3: Phase and amplitude adjustment module

The control voltages for the phase and amplitude adjustment modules are generated by a National Instruments multi-channel 13 bit digital-to-analogue (DAC) card (NI PCI 6723); similarly, the outputs of the vector voltmeter are digitised using a NI 16 bit analogue-to-digital (ADC) card (NI PCI 6251). The control of the ADCs and DACs, as well as the implementation of the control loop and the calibration algorithms, was all done through Labview.
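The quadrant correction of equations (1) and (1.1)-(1.4) is straightforward to implement. The following minimal sketch (in Python rather than the platform's Labview code) maps the detector's I and Q outputs to a monotonic 0-360° phase:

    import math

    def detector_phase(i, q):
        """Convert AD8302-style detector outputs I = A*cos(phi) and
        Q = A*sin(phi) to a monotonic phase in 0..360 degrees,
        following equations (1) and (1.1)-(1.4)."""
        phi = math.degrees(math.atan(q / i))  # equation (1); assumes i != 0
        if i > 0 and q > 0:
            return phi                        # (1.1): first quadrant
        if i < 0:
            return phi + 180.0                # (1.2)/(1.3): second and third quadrants
        return phi + 360.0                    # (1.4): fourth quadrant

Equivalently, math.degrees(math.atan2(q, i)) % 360.0 produces the same mapping and also handles the I = 0 boundary that the piecewise form leaves undefined.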
III. PLATFORM MEASUREMENTS

In the previous section the operation of the phase and amplitude adjustment and vector voltmeter modules was explained. In this section, measurements from these modules, which are the core components of the platform, are presented and discussed.

The phase and amplitude adjustment module was tested using a vector network analyser. By sweeping the module control voltages, the phase and amplitude of the RF signal at the output were plotted against the RF signal at the input; this is shown in Figures 4 and 5.


All RF measurements were taken at 2.46GHz. The amplitude response is very non-linear but nonetheless monotonic. The phase response covers 398° and has some non-linearity at low voltages, but again is monotonic. Non-linearity in the module's responses is not critical, as phase and amplitude are set within a control loop which uses the vector voltmeter response to set reference points.

Figure 4: Phase and Amplitude Adjustment Module - Phase Output
Figure 5: Phase and Amplitude Adjustment Module - Amplitude Output

The phase and amplitude adjustment module was reused to generate phase and amplitude differences between the RF inputs of the vector voltmeter module. The module DC output levels were recorded over the full platform phase and amplitude range between its RF inputs.

Figure 6: Vector Voltmeter - Phase Output
Figure 7: Vector Voltmeter - Amplitude Output

Figure 6 shows the vector voltmeter phase output against phase input; the phase output is calculated from the I and Q outputs as described in the previous section. The response covers 360° before wrapping back to 0°; there is some non-linearity, but this could easily be adjusted for with a look-up table. The relative amplitude plot in Figure 7 shows a range of 25dB for the platform; this is much greater than the expected amplitude mismatches in a beamformer.

IV. CALIBRATION ALGORITHM DEMONSTRATION

The initial application for the platform was to test calibration algorithms developed at the Institute. The calibration algorithms work by defining a single antenna element as a reference and calibrating all other antennas relative to that reference by following a particular route through the elements of the array. The choice of reference element and the path chosen determine the efficacy of the algorithm in terms of accuracy and speed. The results of testing different algorithms have been presented in [9]. In this paper the measurements from testing one of these algorithms are presented as an illustration of the capabilities of the platform.

The dual path algorithm is a comparison based calibration algorithm. It selects a reference element in the left hand corner of the array and then performs comparisons with the elements coupled to the reference antenna element; it takes two paths of identical length to each element of the array from the reference antenna. These two paths are averaged to reduce the effect of coupler errors.
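To make the dual path idea concrete, the sketch below (an illustration under assumed data structures, not the authors' Labview implementation) estimates one element's offset along two equal-length coupler paths and averages the two estimates on the circle:

    import math

    def dual_path_offset(path_a, path_b):
        """Estimate an element's phase offset (degrees) relative to the
        reference element. Each path is a list of pairwise phase
        differences, measured by the vector voltmeter, along one chain of
        couplers from the reference to the element; the two estimates are
        averaged to reduce the effect of individual coupler errors."""
        est_a = sum(path_a) % 360.0
        est_b = sum(path_b) % 360.0
        # Average on the unit circle so values either side of the
        # 0/360 degree wrap are combined correctly.
        x = sum(math.cos(math.radians(e)) for e in (est_a, est_b))
        y = sum(math.sin(math.radians(e)) for e in (est_a, est_b))
        return math.degrees(math.atan2(y, x)) % 360.0

    # Example: two hops per path, e.g. reference -> coupled element -> target.
    correction = dual_path_offset([12.0, -3.5], [11.0, -2.0])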


Figure 8 shows a photograph of the calibration algorithm test set-up. The antenna connections on the 2x4 coupler array were connected to the inputs of a high speed digital oscilloscope (Agilent Infinium 5483A DSO, 2.5GHz); unused antenna connections were terminated with 50Ω loads. The oscilloscope offers resolutions of better than 1° in phase and better than 0.1dB in amplitude, which is sufficient to verify the operation of the platform and algorithm.

Figure 8: Calibration and beamforming platform with antennas replaced by a high speed digital oscilloscope

To represent an uncalibrated system, each channel was set to a random phase shift and amplitude attenuation by applying control voltages to the phase and amplitude adjustment modules. An oscillogram of the random phase and amplitude relationships on four of the antenna connections is shown in Figure 9. The dual path algorithm was then run in Labview on the platform PC and the phase and amplitude relationships were measured on the oscilloscope; Figure 10 shows an oscillogram of the signals at the antenna connectors after calibration.

Figure 9: Oscillogram of uncalibrated output signals
Figure 10: Oscillogram of antenna connections after calibration

From the oscillograms it is clear that after calibration there is no visible phase difference and only a small visible amplitude difference between the channels. More precise measurements for each TX output of the array are presented in Table 1.

Table 1: Transmit phase and amplitude measurements wrt TX1

                            TX1    TX2    TX3    TX4    TX5    TX6    TX7    TX8
    Phase wrt TX1 (°)        0     1.8   -2.2   -2.2   -1.6    0.8    0.4   -0.4
    Amplitude wrt TX1 (dB)   0     0.11  -0.27  -0.03  -0.49  -0.56  -0.22  -0.3

The results table shows that the maximum phase difference from the reference element (TX1) was 2.2° and between elements was 4°. The maximum amplitude difference between the reference and the other elements was 0.56dB and between all elements was 0.67dB.

V. CONCLUSIONS

This paper presented a test platform for the exploration and development of tower top antenna array calibration algorithms and technology. The platform operates at 2.46GHz and uses off-the-shelf components in a modular, easily expandable architecture. The software, Labview, allows easy configuration of the hardware and implementation of calibration algorithms. Platform measurements were presented which showed a phase and amplitude control range of 0 to 360° and 25dB respectively for each array output. Additionally, a calibration routine was run on an array with antenna outputs preset to random amplitudes and phases; the routine succeeded in reducing the phase and amplitude difference between outputs to less than 1dB in amplitude and 5° in phase.

ACKNOWLEDGMENT

The authors would like to thank Science Foundation Ireland for their generous funding of this project through the Centre for Telecommunications Value Chain Driven Research (CTVR).


REFERENCES

[1] J. K. Hsiao, "Design of error tolerance of a phased array," Electronics Letters, vol. 21, pp. 834-836, 1985.
[2] N. Jablon, "Effect of element errors on half-power beamwidth of the Capon adaptive beamformer," IEEE Transactions on Circuits and Systems, vol. 34, pp. 743-752, 1987.
[3] R. Elliott, "Mechanical and electrical tolerances for two-dimensional scanning antenna arrays," IRE Transactions on Antennas and Propagation, vol. 6, pp. 114-120, 1958.
[4] K. Carver, W. Cooper, and W. Stutzman, "Beam-pointing errors of planar-phased arrays," IEEE Transactions on Antennas and Propagation, vol. 21, pp. 199-202, 1973.
[5] J. McCormack, T. Cooper, and R. Farrell, "Tower-Top Antenna Array Calibration Scheme for Next Generation Networks," EURASIP Journal on Wireless Communications and Networking, vol. 2007, p. 12, 2007.
[6] Y. Huang, "Design of a Dynamic Beamforming Antenna for Wimax Radio Systems," IEEE Aerospace Conference 2008, pp. 1-6, 2008.
[7] J. McCormack, T. Cooper, and R. Farrell, "A Multi-Path Algorithmic Approach to Phased Array Calibration," presented at the Second European Conference on Antennas and Propagation (EuCAP 2007), 2007.
[8] T. S. Cooper, G. Baldwin, and R. Farrell, "Six-port precision directional coupler," Electronics Letters, vol. 42, pp. 1232-1233, 2006.
[9] J. McCormack, G. Corley, and R. Farrell, "Experimental Results of Non-Radiative Calibration of a Tower Top Adaptive Array," 3rd European Conference on Antennas and Propagation, EuCAP 2009. Awaiting publication.
[10] T. S. Cooper, J. McCormack, R. Farrell and G. Baldwin, "Towards Scalable, Automated Tower-Top Phased Array Calibration," Vehicular Technology Conference, Dublin, Ireland, April 23-25, 2007.


[Table: rotation angles simulated at each PIFA-body separation: ±8° in 2° steps at 3 mm; ±10° in 2° steps at 4 mm; ±12° in 2° steps at 5 mm; ±20° in 4° steps at 10 mm; ±30° in 5° steps at 20 mm.]

[Antenna dimensions: L = 39.78 mm, W = 9.17 mm, H = 7.75 mm, L1 = 22.65 mm, W1 = 5.6 mm, W2 = 2 mm, W3 = 2 mm.]

Figure 1: The PIFA antenna in free space

Figure 2: The PIFA rotating along the y axis (clockwise rotation represents a negative angle and counterclockwise rotation represents a positive angle)

Figure 3: (a) The real part and (b) the imaginary part of port impedance variations with separation and rotation angle at 2.44 GHz. The curves labelled ref represent the PIFA port impedance in free space, where the ripple is caused by simulation numerical errors.

Port impedance is a parameter that demonstrates antenna detuning effects in a complex user environment, which impacts the resonant frequency and the power available to the antenna. To detail these effects, the impedance is divided into real and imaginary parts instead of the usual S11. As exhibited in Figure 3(a), the real part of the port impedance decreases with rising separation, and its deviation decreases as well. Figure 3(b) shows the imaginary part of the port impedance; the slopes of the curves are gradually reduced with growing separation, which could be explained by weaker coupling between the PIFA and the human phantom. Approximate equations are introduced to describe these variations.

Equation 1: the real part of the port impedance

Re(z) = f(d) + n(θ, d)   [1]


f(d) = 96.354 − 28.546·ln(d)    for d < 10 mm
f(d) = 35                       for 10 mm ≤ d ≤ 20 mm

n(θ, d) = (4.77 − 1.94·ln(d))·θ     when 1 mm ≤ d < 10 mm
n(θ, d) = (0.0161·d − 0.8488)·θ     when −30 < θ < 0 and 10 mm ≤ d ≤ 20 mm
n(θ, d) = −0.1357·θ                 when 0 ≤ θ ≤ 30 and 10 mm ≤ d ≤ 20 mm

Equation 2: the imaginary part of the port impedance

Im(z) = g(d)·θ + m(θ)   [2]     when 2 mm ≤ d < 20 mm
g(d) = 3.1713·ln(d) − 8.7871
m(θ) = −10.723·ln(θ) + 14.409

Im(z) ≈ 0   [3]                 when d ≥ 20 mm

3. PIFA radiation efficiency variations with distance and rotation angle

The radiation efficiency is defined as the total radiated power divided by the maximum available power when the antenna is impedance matched. For a wearable antenna especially, the human phantom absorbs part of the radiated energy due to the microwave heating effect. The efficiency indicates the total obtainable gain of the PIFA, which also has an impact on battery lifetime.

η_r = s(d)·θ + 0.29        1 mm ≤ d ≤ 5 mm     [4]
η_r = q(d)·θ + w(d)        5 mm < d ≤ 20 mm

s(d) = −0.0149·ln(d) + 0.0212
q(d) = 0.0079·ln(d) − 0.0242
w(d) = 0.0285·d + 0.1527

Figure 4: The radiation efficiency changing with separation and rotation angle

4. PIFA radiation pattern variations with distance and rotation angle

The radiation pattern always changes with antenna orientation, whether a PIFA antenna is used in off-body or on-body channels. Off-body channels need more energy propagating away from the body surface, whereas on-body channels need energy propagating along the body surface. Here we display several radiation patterns at different distances and angles that provide a clear picture of the body effect on PIFA radiation performance.

Figure 5: The E plane (ZX plane) gain pattern (a) at 0 degree rotation with different separations, (b) at 2mm separation with different rotation angles
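For convenience, equations (1)-(4) can be evaluated directly. The sketch below is a transcription of the fitted models, not part of the paper; the paper does not define m(θ) for θ ≤ 0, so the abs() and the zero fallback used here are assumptions:

    import math

    def port_impedance(d, theta):
        """Fitted PIFA port impedance at 2.44 GHz; d: separation in mm
        (roughly 1..20), theta: rotation angle in degrees (-30..30)."""
        # Equation (1): Re(z) = f(d) + n(theta, d)
        f = 96.354 - 28.546 * math.log(d) if d < 10 else 35.0
        if d < 10:
            n = (4.77 - 1.94 * math.log(d)) * theta
        elif theta < 0:
            n = (0.0161 * d - 0.8488) * theta
        else:
            n = -0.1357 * theta
        # Equations (2)-(3): Im(z)
        if d >= 20:
            im = 0.0
        else:
            g = 3.1713 * math.log(d) - 8.7871
            # Assumption: abs() and a zero fallback, since m(theta) is
            # only given in terms of ln(theta).
            m = -10.723 * math.log(abs(theta)) + 14.409 if theta else 0.0
            im = g * theta + m
        return f + n, im

    def radiation_efficiency(d, theta):
        """Fitted radiation efficiency model, equation (4)."""
        if d <= 5:
            return (-0.0149 * math.log(d) + 0.0212) * theta + 0.29
        return (0.0079 * math.log(d) - 0.0242) * theta + (0.0285 * d + 0.1527)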


The results show that the separation (D) has the stronger influence on the gain pattern. The available gains in the main lobe increase with rising separation, which agrees with the antenna efficiency variations in Figure 4. There is clear attenuation in the lower half plane compared to that of free space, due to the phantom absorption, regardless of the separation. Small rotations have limited impact on the antenna radiation pattern, as shown in Figure 5, although they significantly affect the port impedance and efficiency, as explained in the previous sections.

5. Conclusions

A PIFA antenna used in a WPN (Wireless Personal Network) is reported in this paper, considering different separations and rotation angles. Simulations prove that the port impedance and the radiation efficiency have a strong correlation with the antenna's position and orientation. Several equations are first reported to describe the port impedance and efficiency as functions of D (the separation) and θ (the rotation angle). The radiation pattern is not sensitive to small rotations. In the future, detailed measurements will be carried out to confirm these equations.

6. Acknowledgement

This work has been supported financially by the Engineering and Physical Sciences Research Council (EPSRC), United Kingdom under Grant EP/D053749/1. The authors wish to thank our consortium partners: Zarlink, Taconic, European Antennas and the Home Office Scientific Development Branch for their encouragement and support.

7. References

[1] P. S. Hall and Y. Hao, Antennas and Propagation for Body-Centric Wireless Communications, Boston, Mass./London, Artech House, ISBN 978-1-580-53493-2, 2006.
[2] T. A. Milligan, Modern Antenna Design, 2nd ed., Hoboken, New Jersey, John Wiley & Sons, ISBN-13 978-0-471-45776-3, 2005.
[3] A. Christ, A. Klingenbock, T. Samaras, C. Goiceanu and N. Kuster, "The dependence of electromagnetic far-field absorption on body tissue composition in the frequency range from 300 MHz to 6 GHz," IEEE Trans. Microwave Theory Tech., vol. 54, no. 5, pp. 2188-2195, 2006.
[4] H. R. Chuang, "Numerical computation of fat layer effects on microwave near-field radiation to the abdomen of a full-scale human body model," IEEE Trans. Microwave Theory Tech., vol. 45, iss. 1, pp. 118-125, 1997.
[5] http://niremf.ifac.cnr.it/tissprop/, 2009.
[6] K. L. Wong and C. I. Lin, "Characteristics of 2.4-GHz compact shorted patch antenna in close to a lossy medium," Microwave Opt. Technol. Lett., vol. 45, pp. 480-483, 2005.
[7] H. R. Chuang and W. T. Chen, "Computer simulation of the human-body effects on a circular-loop-wire antenna for radio-pager communications at 152, 280, and 400 MHz," IEEE Trans. Vehicular Technology, vol. 46, pp. 544-559, 1997.
[8] W. G. Scanlon, N. E. Evans and M. Rollins, "Antenna-body interaction effects in a 418 MHz radio telemeter for infant use," Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings, pp. 278-279, 1996.
[9] H. E. King and J. L. Wong, "Effect of a Human Body on a Dipole Antenna at 450 and 900 MHz," IEEE Trans. Antennas Propag., AP-25, no. 3, pp. 376-379, 1977.
[10] H. R. Chuang, "Human operator coupling effects on radiation characteristics of a portable communication dipole antenna," IEEE Trans. Antennas Propag., vol. 42, no. 4, pp. 556-560, 1994.
[11] M. F. Iskander, Z. Yun and R. Q. Illera, "Polarization and human body effects on the microwave absorption in a human head exposed to radiation from handheld devices," IEEE Trans. Microwave Theory Tech., vol. 48, no. 11, pp. 1979-1987, 2000.
[12] D. Nashaat, H. Alsadek and H. Ghali, "Investigation of the mutual effect between human head and new shapes of PIFAs used in mobile communication systems," Microwave Opt. Technol. Lett., vol. 46, pp. 243-248, 2005.


Design of Compact Annular-Ring Patch Antennas for Circular Polarization

X. L. Bao and M. J. Ammann
Centre for Telecommunications Value-chain Research,
School of Electronic & Communications Engineering,
Dublin Institute of Technology, Kevin Street, Dublin 8, Ireland

Abstract

Several novel compact annular-ring patch antennas for circular polarization are presented. Two techniques are employed to reduce the antenna size and provide a suitable input impedance match: one is to insert strips into the annular ring and the other is to place a cross-slot into the ground plane. The proposed annular-ring patch with a cross-slotted ground plane can obtain a much smaller size for a given frequency. The resonant frequencies of these novel antennas can effectively be reduced due to the increased path length of the surface current. Dual annular-ring patch antennas with an embedded circular patch can provide dual-frequency circular polarization characteristics. The performances of the proposed antennas are discussed.

Keywords: Annular-ring antennas, circular polarization, cross-slot, dual-frequency

Introduction

With the increased development of wireless communications systems, miniaturization of circularly polarized antennas has become more attractive to the engineering researcher. Annular-ring patch antennas have a smaller dimension compared to square and circular patch antennas [1-5]. If the annular ring is embedded with a pair of notches and a strip in the inner circle, the antenna will exhibit circular polarization characteristics. Aperture-coupled microstrip-fed annular-ring patch antennas have also been shown to produce circular polarization [6].

In this paper, two techniques are applied to the circularly polarized annular-ring patch antenna: embedding strips in the annular ring and cutting a crossed slot into the ground plane. For a patch antenna with a narrow annular ring, it is very difficult to achieve a good match to the 50 Ohm impedance of the coaxial probe. To obtain a compact patch size and circular polarization at a given frequency, branch strips are employed to match the coaxial probe. The proposed structures can effectively miniaturize the patch antenna size. A significant further size reduction can be achieved by augmenting the annular-ring patch antenna with a crossed slot in the ground plane [7-8]. Various novel annular-ring circularly polarized antennas are designed and studied experimentally, and their circular polarization performances are evaluated.

Annular Ring Patch with Embedded Strips

It is well known that the annular-ring patch antenna is smaller than the rectangular patch or circular patch antenna. At the same time, investigations on the annular ring indicate that when the inner radius of the annular ring is increased, the resonant frequency is decreased. But as the inner radius becomes large, a high impedance is created and it is difficult to provide a suitable match to 50 Ohms. So, some matching strips are placed inside the annular ring to obtain good matching at the low frequency. In this paper, three new structures with embedded cross-strips are presented in Figure 1(a), (b) and (c). If the position of the feed point is properly selected, the two orthogonal modes of the annular-ring patch antennas can be excited with equal amplitude and 90 degree phase difference at a given frequency. Thus, the characteristics of circular polarization for the annular ring are realized. The three patch antennas presented in this paper are printed on FR4 substrate, of relative permittivity 4.2, thickness 1.52mm and loss tangent 0.02. Using optimized results, the antenna dimensions for Antennas A, B and C are listed in Table 1.


For the same size of annular-ring outer radius, the measured return loss and axial ratio properties of the three different circularly polarized annular-ring antennas are plotted in Fig. 2 and Fig. 3, respectively. It is found that the three antennas have good impedance bandwidth and axial-ratio performance. The normalized spinning radiation patterns in the XoZ plane for the three types are shown in Fig. 4(a), (b) and (c) at their individual centre frequencies (1.562GHz, 1.556GHz, 1.452GHz), respectively.

Figure 1. Annular-ring patch with embedded strips: (1) Antenna A, (2) Antenna B, (3) Antenna C

Table 1. The parameters of the three annular-ring patch antennas with embedded cross-strips (mm)

                  R1     R2     W    Feed point     L1     L2
    Antenna A     24.8   8.2    1    (-7, 7)        4      -
    Antenna B     24.8   7.8    2    (-6.6, 6.6)    3.8    -
    Antenna C     24.8   12.8   2    (-3, 3)        5.9    9.8

Figure 2. S11 for the three annular-ring patch antennas

Figure 3. Axial ratio for the three antennas

Figure 4. Spinning dipole radiation patterns for the antennas at different frequencies


Annular-Ring Patch Antenna with Embedded Circular Patch and a Cross-Slotted Ground Plane

In order to miniaturize the annular-ring antenna, a cross-slot in the ground plane is employed to compact the annular ring, due to the increased surface current path. To match the annular-ring patch, a circular patch placed in the centre of the annular-ring patch is used, as shown in Figure 5(a). Figure 5(b) shows the geometry of the proposed antenna with dual-frequency circularly polarized characteristics. These patches are printed on FR4 substrate, of permittivity εr = 4.0 and thickness 1.52 mm. The crossed slot in the ground plane has unequal lateral lengths, L1 and L2, with a slot width W. This structure can excite two degenerate orthogonal modes with equal amplitude and 90 degree phase difference by tuning various parameters (R2, R3, L1 and L2), and right-hand circular polarization (RHCP) radiation is obtained. The optimized dimensions selected are listed in Table 2. For better matching of the input impedance, a circular slot of radius R4 = 4mm is located at the centre of the cross slot. The antenna is excited by a 50 ohm coaxial probe, and the position of the feed point along the diagonal line is also listed in Table 2.

Figure 5. The geometries of the annular-ring patch antennas loaded by a circular patch with a cross-slotted ground plane: (a) the proposed single-frequency CP antenna, (b) the proposed dual-frequency CP antenna

For the single-frequency CP antenna, in comparison to the conventional circular patch antenna, the centre frequency of the proposed annular-ring patch antenna is smaller by 55% for the same substrate and outer circular radius. The measured 10 dB return loss impedance bandwidth for the proposed antenna is approximately 65 MHz (6.1%) from 1.039 GHz to 1.094 GHz, as shown in Figure 6.

Figure 6. Measured and simulated S11 for the single-frequency CP antenna


[4] C. Y. Sim, K. W. Lin and J. S. Row, "Design of An Annular-Ring Microstrip Antenna for Circular Polarization," IEEE Antennas and Propagation Society International Symposium, Vol. 1, 2004, pp. 471-474.
[5] H. M. Chen and K. L. Wong, "On the circular polarization of annular-ring microstrip antennas," IEEE Transactions on Antennas and Propagation, Vol. 47, No. 8, 1999, pp. 1289-1292.
[6] J. S. Row, "Design of Aperture-Coupled Annular-Ring Microstrip Antennas for Circular Polarization," IEEE Transactions on Antennas and Propagation, Vol. 53, No. 5, 2005, pp. 1779-1784.
[7] X. L. Bao and M. J. Ammann, "Compact Annular-Ring Embedded Circular Patch Antenna with a Cross-Slot Ground Plane for Circular Polarization," IEE Electronics Letters, 2006, 42, (4), 192-193.
[8] X. L. Bao and M. J. Ammann, "A Single-Layer Dual-Frequency Circularly-Polarized Patch Antenna with Compact Size and Small Frequency Ratio," IEEE Transactions on Antennas and Propagation, Vol. 55, No. 7, 2007, pp. 2104-2107.

Table 2. Dimensional parameters for the antennas (mm)

                          R1     R2     R3    W1    W2    L1     L2     SD1   Wp    Feed point
    Single CP Antenna     24.8   9.0    -     -     -     48.0   50.0   4.0   2.8   (-5, -5)
    Dual CP Antenna       24.0   18.9   6.5   0.8   7.5   40     42.4   --    --    (-3, -3)


Section 1B

LOCATION-BASED SYSTEMS


Interpretation of Spatial Movement and Perception in Location Based Services

Eoin Mac Aoidh
National Centre for Geocomputation
National University of Ireland Maynooth, Ireland
Email: eoin.macaoidh@nuim.ie

Adam Winstanley
Department of Computer Science
National University of Ireland Maynooth, Ireland
Email: adam.winstanley@nuim.ie

Abstract—Location Based Services should deliver pertinent information to the user at the right place and at the right time. Such is the range of available content that it must be filtered and prioritised according to the user's context to reduce wait time and eliminate the delivery of unwanted information. While some contextual information can be inferred from the device sensors, such as location and time, a deeper understanding of the user's context can be inferred by combining these sources with an implicit interpretation of the user's actions. This paper proposes an experiment to compare the user's actions in a real world environment to his actions in an identical virtual world, enabling accurate contextual inferences to be made. The real world study allows an analysis of real movements, which can be correlated with movements in the virtual world, with a greater potential for additional psychological analysis as part of the virtual world.

I. INTRODUCTION

As the proliferation of location aware mobile devices increases due to continuous technology improvements, the demand for Location Based Services (LBS) is increasing. The goal of a LBS is to deliver appropriate content relevant to a specific location (usually the user's current location) at the time when it is required. For example, a LBS might deliver information on nearby restaurants to the user; however, a well designed LBS must take account of more than just location. Several other contexts, such as time of day, menu preference, and whether the user is hungry or not, in this example, are crucial to the development of an intelligent service. The LBS must employ any contextual information which can be gleaned implicitly to its advantage. In this case, if it can be inferred, based on device location, that the user has been sitting in a restaurant for the last 45 minutes, it is likely that the user has eaten, and that restaurants are no longer a priority for the user. Therefore the order of priority content must be recalculated according to the user's changing context (a minimal sketch of this dwell-time inference is given at the end of the next section).

LBS are designed for use on a mobile device, over a wireless network. Transmission of information is often affected by poor network coverage and slow processing on the device. Furthermore, the quantity of spatially referenced information available is continuously growing. As a result, it is crucial for the most important content to be prioritised for delivery to the user's device to reduce latency for the user. Moreover, irrelevant information must be elided to avoid swamping the user with information. In order to provide such a contextual, personalised LBS, the user's context, intentions, interests and preferences must be accurately interpreted from all available sources. We propose the development of a testing environment for the interpretation of a user's spatial movements and perceptions, both in a real world and in an identical 3D virtual environment. By understanding the user's actions and correlating them to what the user can see in the world around him, in relation to what is visible on the device screen, issues such as the user's concept of proximity, i.e. what services are 'nearby', and relevance, i.e. features that are used for orientation purposes, those which correspond to features/services of interest, and those which are irrelevant, can be explored. An improved understanding of these user context issues paves the way for the delivery of an improved LBS, adding value to the service, and enhancing the user's experience.

II. RELATED WORK

Implicit Interest Indicators (III) [1] are employed by many applications with a user modelling component. They are a means of interpreting the user's interest in a particular item. For example, the printing of a Webpage as an III signifies an interest in the contents of the page [1]. In the context of a spatial browser, zooming in on a specific area signifies an interest in its contents [2] [3]. By studying sequences of operations, it is apparent that certain combinations of operations have different meanings, depending on their order of execution [4]. Furthermore, the strength of inferences made based on user interaction varies depending on the context in which they were executed [2]. For example, it has been shown that at times when the mouse is not actively engaged in interface manipulation, the user's eye movements, and consequently the mental processing of the information on screen, are connected to the location of the mouse cursor [5]. At other times, however, when the mouse is being actively engaged to pan and zoom through the map for example, its location in relation to the contents of the map is not as indicative of the user's objects/areas of interest [2].

The cognitive processes involved in the human brain while processing traditional 2D maps, on paper and in digital form, are explored in [6] [7]. The authors explore the possibilities of exploiting Cognitive Load Theory (CLT) for more comprehensive and robust methodologies for 2D map construction and analysis. Human cognition in relation to virtual 3D environments is explored in [8]. The human perception of directionality between virtual landmarks is exploited for the improvement of spatial learning. The test environment proposed in this paper is concerned with the inferences about the user's notions of spatial perception, in both two and three dimensions, which can be made from the physical movements of the user through both real and virtual 3D space. The inferences which can be made from a user's physical interactions with a map interface, including mouse movements and map manipulation interactions, were explored in [2]. It is intended to bring this concept of physical movement analysis for the interpretation of the user's spatial perception to both the real and virtual 3D space for experimentation.
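The introduction's restaurant scenario can be made concrete. The following is a minimal sketch (ours, not the authors'); the 45-minute threshold comes from the paper, while the function names, the 30 m radius and the flat-earth distance are illustrative assumptions:

    import math
    from datetime import timedelta

    def dwell_time(fixes, poi, radius_m=30.0):
        """Total time a stream of location fixes spends within radius_m
        of a POI. fixes: list of (timestamp, lat, lon) sorted by time;
        poi: (lat, lon). Flat-earth distance, adequate at campus scale."""
        def dist_m(lat1, lon1, lat2, lon2):
            k = math.radians(1.0) * 6371000.0  # metres per degree
            dx = (lon2 - lon1) * k * math.cos(math.radians((lat1 + lat2) / 2))
            dy = (lat2 - lat1) * k
            return math.hypot(dx, dy)

        total = timedelta()
        for (t0, la0, lo0), (t1, la1, lo1) in zip(fixes, fixes[1:]):
            if dist_m(la0, lo0, *poi) <= radius_m and dist_m(la1, lo1, *poi) <= radius_m:
                total += t1 - t0
        return total

    def restaurants_still_relevant(fixes, restaurant, threshold=timedelta(minutes=45)):
        """A 45-minute dwell at a restaurant suggests the user has eaten,
        so restaurant content is demoted in the priority ordering."""
        return dwell_time(fixes, restaurant) < threshold

A real service would combine such dwell-time signals with the interaction-based implicit interest indicators discussed above before re-ranking content.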


Fig. 1. A screenshot showing a section of the 3D campus model.

III. PROPOSED EXPERIMENT

The test bed for the proposed experiment is a university campus. An accurate 3D model of the campus has been generated from LiDAR point clouds. The model covers an area of 500m by 700m, containing 14 buildings represented in exceptional detail. The internal representations of some of the main buildings have also been built. This gives us an identical, accurate 3D virtual representation of the real world campus, a section of which is shown in Figure 1. Such is the scale and detail of the model that experiments could be carried out at scales varying from an area covering 5m to 500m, or even internally within the buildings, over multiple floors.

In the proposed set of experiments, users will carry a mobile device offering an interactive 2D map of the campus with information on all of the points of interest. A number of predefined tasks will be carried out, requiring them to navigate through the campus. Similar tasks will subsequently be carried out in the virtual environment. The same 2D application on the mobile device will be available to the user. Their navigation through the real world environment will be assessed, and compared to their navigation through the virtual environment at a range of scales. Of primary interest are physical movement patterns which could act as implicit interest indicators about the user's context, and the contents of the map on the mobile device in relation to the objects which the user can see in the real/virtual world. Such an experiment comparing and contrasting navigation in identical real and virtual worlds is of importance, as further analysis can be conducted in the virtual world which is not possible in a real world scenario. Analysis including eye tracking and EEG monitoring in the virtual world would follow, which is not feasible in the real world as the equipment is intrusive and cumbersome. Inferences about the user's eye movements and brain activity in the real world, e.g. for navigation, can be made if the correlation between real and virtual world performance is well tested and documented.

IV. CONCLUSION

Location Based Services must deliver contextually relevant content in order to provide a useful service to the user, minimising download time and reducing information overload. Certain fundamental inferences about location, time, etc. can be made based on the device sensors. Our goal is to enrich this contextual information by implicitly inferring additional information about the user based on the relationship between his/her physical movements and corresponding interactions with an associated 2D map on a mobile device. We propose an experiment to examine the relationship between physical movements through space and 2D map browsing in identical real and 3D virtual worlds. By assessing the correlation between the real and virtual worlds, it could be possible to conduct an additional psychological analysis on interactions in the virtual world using equipment which would be inappropriate in a real world scenario.

ACKNOWLEDGMENT

Research presented in this paper was funded by a Strategic Research Cluster grant (07/SRC/I1168) by Science Foundation Ireland under the National Development Plan. We gratefully acknowledge this support.

REFERENCES

[1] M. Claypool, P. Le, M. Waseda, and D. Brown, "Implicit Interest Indicators," in Proceedings of IUI'01, pp. 33-40, Santa Fe, New Mexico, USA. ACM, 2001.
[2] E. Mac Aoidh, M. Bertolotto and D.C. Wilson, "Understanding Geospatial Interests by Visualising Map Interaction Behaviour," in Information Visualisation, Vol. 7, No. 3-4, pp. 257-286, Palgrave, 2009.
[3] T. Tezuka and K. Tanaka, "Presentation of Dynamic Maps by Estimating User Intentions from Operation History," in Proceedings of MMM2007, pp. 156-165, January 2007, Singapore. Springer Verlag.
[4] M. Hirose, R. Hiramoto, and K. Sumiya, "GeminiMap - Geographical Enhanced Map Interface for Navigation on the Internet," in Proceedings of W2GIS'07, pp. 279-292, November 2007, Cardiff, UK. Springer Verlag.
[5] M.C. Chen, J.R. Anderson and M.H. Sohn, "What Can a Mouse Cursor Tell Us More? Correlation of Eye/Mouse Movements on Web Browsing," in Proceedings of CHI'01, pp. 281-282, Seattle, Washington, USA. ACM, 2001.
[6] A.K. Lobben, "Tasks, Strategies, and Cognitive Processes Associated With Navigational Map Reading: A Review Perspective," The Professional Geographer, 56:2, pp. 270-281, 2004.
[7] R.L. Bunch and R.E. Lloyd, "The Cognitive Load of Geographic Information," The Professional Geographer, 58:2, pp. 209-220, 2006.
[8] W.S. Albert, R.A. Rensink and J.M. Beusmans, "Learning relative directions between landmarks in a desktop virtual environment," Spatial Cognition and Computation 1: pp. 131-144, Kluwer, 1999.


III. DATA COLLECTION

Accurate spatial data and detailed geographical attribute information are the basis of LBS applications for any environment, including university towns. In Ireland there are generally no suitable free data that can be directly used for such applications. In Maynooth, for example, Google Earth does not provide good resolution remote sensing images for navigation. Also, Google Maps does not provide attribute data for POI queries and pedestrian navigation. Furthermore, obtaining high resolution remotely sensed imagery from commercial companies would incur a large financial cost that the project cannot support.

Beginning in 2008 a cost effective method for creating a map of the campus was sought for NUI Maynooth and the surrounding area. This map creation task was necessary because the map data was required for use in prototyping the pedestrian navigation application. The fastest way to solve this problem would have been to use an "out of the box" mapping system like Google Maps or Virtual Earth. The other more costly option was to purchase the data about streets and paths in this area from a company which prepares maps for GPS devices.

For pedestrian navigation a very detailed map of the campus area is required. This map must, by default, include all streets on which it is possible to drive, but it must also include pedestrian ways (paths, lanes, walkways, trails). A quick, efficient, and accurate solution to this problem was to create the map ourselves. OpenStreetMap is a free and open map of the world.

OpenStreetMap allows users to upload GPS data, aerial photography, and other spatial sources for inclusion on the OpenStreetMap (OSM) map of the world. There are many documents on the OSM Wiki pages to get started with map creation. A GPS logger device was obtained for GPS coordinate collection: a GlobalSat® DG-100 GPS Data Logger. This is not a complicated GPS device. It contains an on/off button, three diodes which show current status, and a simple trigger to allow the operator to choose how frequently the position measurement is captured and recorded into the device's internal memory. This device is very widely used by the OpenStreetMap community. It is very easy to use this device with Linux servers, as drivers are available using an application called GPSbabel [1]. Based on the same code, a special plug-in for an OSM editor called JOSM was produced by the OpenStreetMap community which downloads routes of journeys directly into a spatial data editor.

At the initial stages of map creation for Maynooth, it was decided that we would collect as much geographical data and information as possible about the vicinity. The final versions of the maps could then be used by members of the local Maynooth community and by us in university projects such as this project on pedestrian navigation. We are hopeful that the "open access to data" philosophy of OSM [2] will mean that it will not be necessary for other projects requiring mapping to re-invent the wheel and collect their own data. The OSM map can be continually updated and edited as geographical features change in the area or as more detailed or new geographical data is collected and uploaded to OSM. Due to this work, at present the OSM map of Maynooth better represents the geographical reality of the area than Google Maps or Virtual Earth. It also offers the University the opportunity to quickly add or remove geographical features on the OSM map to accurately reflect changes in the physical campus structure, for example new footpaths or the relocation of facilities such as postboxes.

OSM data can be accessed in XML format. The XML format is verbose and shows all of the attributes for a line (roads, streets, paths) or point (Point of Interest, shop, amenity) feature on the map. An example is shown below for one point of interest (POI), Brady's Public House on Main Street, Maynooth. The subset of the XML is as follows:
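(The XML listing itself did not survive in this copy of the proceedings. The element below is a representative reconstruction of the kind of OSM node described; the id, coordinates and exact tag values are illustrative, not the original data.)

    <node id="123456789" lat="53.3817" lon="-6.5912" visible="true">
      <tag k="amenity" v="pub"/>
      <tag k="name" v="Brady's"/>
    </node>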
It also offers the University the opportunity to quickly add or remove geographical features on the OSM map to accurately reflect changes in the physical campus structure – for example, new footpaths or the relocation of facilities such as postboxes.

OSM data can be accessed in XML format. The XML format is verbose and shows all of the attributes for a line (roads, streets, paths) or point (point of interest, shop, amenity) feature on the map. An example is shown below for one point of interest (POI), Brady's Public House on Main Street, Maynooth. A subset of the XML is shown as follows.
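The node element below is representative of OSM's XML format; the id, coordinate and tag values are illustrative stand-ins rather than the actual database entries:

    <node id="123456789" lat="53.3812" lon="-6.5915" visible="true">
      <tag k="amenity" v="pub"/>
      <tag k="name" v="Brady's"/>
      <tag k="addr:street" v="Main Street"/>
    </node>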


In graphical format, Brady's pub is displayed on a map tile rendered by the Mapnik software package (figure 2).

Figure 2: data collection sample

The first step in building the OSM map is travelling along the streets, lanes and paths in the town and capturing GPS coordinates for these features. This is a time-consuming process. Before map data collection began, the Maynooth town area was divided into small, manageable segments. The process of data collection, editing, and final upload to the OSM servers took between two and three days for each segment. In Ireland it is difficult to access high resolution aerial imagery without incurring a very large cost. This meant that the only way to capture high resolution data was to physically visit every geographical feature in Maynooth that we wished to have on the map.

The typical procedure was to travel the streets and lanes by bicycle to capture their shape in GPS coordinates. Data capture by bicycle was useful for several reasons. Firstly, some footpaths were hidden between trees and bushes and could only have been captured by ground survey. The bicycle also allowed travel along pedestrian-only streets and lanes. At the same time, the relative locations of other important features were captured, particularly POIs such as speed bumps, street lights, bus stops and shops. Photographs were taken with a digital camera to improve the accuracy of the placement of these POIs on the map and of their physical description.

Upon returning to the laboratory, the data from the GPS logger were uploaded. Street and line feature data were uploaded first, followed by information on POIs. The Java OpenStreetMap (JOSM) editor [3] was used for this task. JOSM has a special plug-in which allows the editor to match digital photographs taken during the survey with points along the recorded line feature. This is done by comparing points on the GPS coordinate line with the time stamps of the JPEG photographs taken by the digital camera.
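This time-stamp matching is straightforward to reproduce outside JOSM. A minimal sketch, assuming a track of (time, lat, lon) fixes and EXIF capture times on a common clock (the coordinates and times below are illustrative, not survey data):

    from datetime import datetime

    def nearest_fix(track, photo_time):
        # track: list of (timestamp, lat, lon) tuples from the GPS logger;
        # photo_time: capture time read from the photograph's EXIF header.
        # Both clocks must agree (synchronise the camera to GPS/UTC first).
        return min(track, key=lambda fix: abs((fix[0] - photo_time).total_seconds()))

    track = [(datetime(2008, 6, 1, 10, 0, 0), 53.3813, -6.5910),
             (datetime(2008, 6, 1, 10, 0, 30), 53.3815, -6.5905)]
    print(nearest_fix(track, datetime(2008, 6, 1, 10, 0, 20)))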
This approach to map creation is not perfect, as it is difficult to capture the shapes (plan view) of buildings. It is possible to create reasonable approximations to the shapes of large buildings by simply walking around them and recording coordinates with the GPS device. However, this is not suitable for all buildings, as it is often the case that not every corner or side is accessible to pedestrians. Therefore, the National Centre for Geocomputation (NCG) at NUI Maynooth provided us with an aerial photograph of the Maynooth campus.

The dimensions of this image are 7113x8810 pixels and it was rectified using the on-line tool Map Warper [4]. The aerial photograph does not cover all of Maynooth, but it has allowed us to create a complete map of campus buildings and campus pathways. Map Warper provides a Web Map Service (WMS) which, in combination with the JOSM plug-in, allows a direct connection. It was then a simple task to place the aerial photograph as a background in JOSM and transfer the necessary features onto the map. The final result of this map production work is a high quality map of our university campus (figure 3).

Figure 3: data collection sample

The map offers greater resolution and more geographical attribute detail than the maps offered by Google Maps or Microsoft Virtual Earth for the same location. The map of Maynooth (town and campus) is served from the OSM public server and can be used freely by any third party. In particular, university research projects can use this map without map licensing constraints or purchase/usage costs. OSM is far from being "the finished article" for Ireland. Many locations in Ireland are very poorly mapped and contain little or no geographic detail. It is our intention to disseminate our knowledge of OSM and map production widely to other researchers and the general public in Ireland. It is hoped that this will help inspire other citizens to become part of the OSM community and assist in improving the representation of the island of Ireland in OSM. This data collection solution makes updating the geographical data of the OSM maps a very low cost exercise. It provides an opportunity for users to add new POIs or other information to the database. For pedestrian navigation it is very important to have the information updated frequently.

IV. PEDESTRIAN NAVIGATION

When visitors use LBS applications in a university town, their main travel mode is walking. Unlike vehicle drivers, pedestrians are not constrained by the road network (for example, lanes, turn restrictions, one-way streets) [5], [6]. In addition, there are some special features that are unique to pedestrian navigation. These include "walking areas" where pedestrians can walk freely, such as squares, grassland, parks and open ground. This is one of the key differences compared with the road networks used for vehicle navigation.


We define a walking area in 2D Euclidean space as an area where pedestrians can walk at random without using fixed paths. A walking area is generally represented by a polygonal feature. Zheng [6] discussed data modelling of pedestrian networks including walking areas. In former work [8] we classified walking areas by three orthogonal attributes: the character of their boundaries and entrances, the concave-convex characteristics of their shape, and the presence of impenetrable islands. By the access character of their boundaries and entrances, walking areas divide into three subtypes: fixed entrances, open boundary (free entrance), and open boundary with restrictions. Considering the subtypes of the other orthogonal factors characterising polygon shapes, there are 12 general situations. For developing robust pedestrian navigation services the entire set of situations must be taken into account.

Pan [7] put forward an algorithm for generating optimal paths within a polygon (a walking area) containing interior obstacles. This algorithm was based on Dijkstra's algorithm. The work also provided some solutions for the special behaviours of pedestrians, such as a preference for easy walking routes and a preference for indoor routes.

Zheng [8] describes a two-level path planning algorithm for pedestrian navigation. In a related paper currently under peer review, the author describes a solution for pedestrian navigation with open boundary areas. It solves the problem of representing links that pass through open boundaries in arbitrary directions by building up the connectivity relationships between the open boundary area and other related spatial nodes (including simplified adjacent open boundary areas). At the first level, an open boundary area is taken as a single node of a link-node network for path planning outside walking areas. The second level is used for optimal path planning inside the walking areas. The detailed algorithm will be published in the near future.
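As an illustration of the first level only (a sketch of the general idea, not the authors' forthcoming algorithm), an ordinary Dijkstra search can treat a whole open-boundary walking area as a single graph node; the node names and costs below are hypothetical:

    import heapq

    def dijkstra(graph, start, goal):
        # graph: {node: [(neighbour, cost), ...]}; plain Dijkstra search.
        dist = {start: 0.0}
        prev = {}
        queue = [(0.0, start)]
        while queue:
            d, node = heapq.heappop(queue)
            if node == goal:
                break
            if d > dist.get(node, float("inf")):
                continue
            for nxt, cost in graph.get(node, []):
                nd = d + cost
                if nd < dist.get(nxt, float("inf")):
                    dist[nxt] = nd
                    prev[nxt] = node
                    heapq.heappush(queue, (nd, nxt))
        path = [goal]
        while path[-1] != start:
            path.append(prev[path[-1]])
        return list(reversed(path))

    # Level one: the open-boundary square is collapsed to the single node
    # "square"; level two would plan the route across the square itself.
    graph = {"gate":    [("square", 2.0)],
             "square":  [("gate", 2.0), ("library", 1.5)],
             "library": [("square", 1.5)]}
    print(dijkstra(graph, "gate", "library"))  # ['gate', 'square', 'library']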
V. CONCLUSION AND FUTURE WORK

Our goal is a cost-effective approach to data collection and software deployment for LBS applications, with pedestrian navigation in a university town as the case study. A 2D data collection solution has been adopted based on OpenStreetMap. This paper has described our preliminary work so far. Future work will focus on better optimal path representation for various topographic situations. Furthermore, we will develop test applications for mobile terminals to allow user testing to be performed. Figure 4 shows a 3D data sample of the NUIM campus obtained as part of the StratAG project. The extension of pedestrian navigation to 3D scenarios (for example, inside buildings) will also be investigated.

ACKNOWLEDGMENT

Research presented in this paper was funded by a Strategic Research Cluster grant (07/SRC/I1168) by Science Foundation Ireland under the National Development Plan. The authors gratefully acknowledge this support.

REFERENCES
[1] Christoph Eckert, GPSBabel application, http://www.gpsbabel.org
[2] Mordechai (Muki) Haklay and Patrick Weber, OpenStreetMap: User-Generated Street Maps, IEEE Pervasive Computing, 7, (4), 12-18, Oct.-Dec. 2008.
[3] Immanuel Scholz, JOSM application, http://josm.openstreetmap.de/
[4] Tim Waters, Map Warper application, http://warper.geothings.net/
[5] Christian Gaisbauer and Andrew U. Frank, Wayfinding Model for Pedestrian Navigation, 11th AGILE International Conference on Geographic Information Science, 2008.
[6] Jianghua Zheng, Jianwei Tao, Jianli Ding, Abudukim Abuliz, and Hanyu Xiang, Pedestrian Navigation Data Modelling for Hybrid Travel Patterns, Geoinformatics 2008, Proc. SPIE, 7144, 71442Y (2008).
[7] Zheng Pan, Yuefeng Liu, Adam C. Winstanley, Lei Yan, Jianghua Zheng, A 2-D ESPO Algorithm and Its Application In Pedestrian Path Planning Considering Human Behavior, Proceedings of MUE'09 (to be published in June 2009), IEEE CS.
[8] Jianghua Zheng, Adam Winstanley, Zheng Pan, Seamus Coveney, Spatial Characteristics of Walking Areas for Pedestrian Navigation, Proceedings of MUE'09 (to be published in June 2009), IEEE CS.

Figure 4: 3D data sample


Wiimote as a Navigation Tool for Pedestrians

Ricky Jacob, Adam Winstanley, Declan Meenagh
Computer Science Department, National University of Ireland Maynooth, Maynooth, Kildare, Ireland
rjacob@cs.nuim.ie, adam.winstanley@nuim.ie

Eoin Mac Aoidh
National Centre for Geocomputation, National University of Ireland Maynooth, Maynooth, Kildare, Ireland
eoin.macaoidh@nuim.ie

Abstract — Pedestrian navigation requires effective communication between the mobile device and the user. The mobile device should be able to give feedback to the user while requiring minimal user input and attention. Mobile interaction for navigation has mostly been through visual interfaces with maps and annotations. This paper describes a multi-modal haptic interface combined with a visual OpenStreetMap interface on a mobile platform. The Wiimote is used as the haptic tool: it vibrates according to the navigation path to be taken by the user, driven by signals sent from the mobile application over a Bluetooth wireless connection.

Keywords — multi-modal, haptic, OpenStreetMap, Wiimote, pedestrian navigation

I. INTRODUCTION

According to a new study from ABI Research [5], the number of subscribers to handset-hosted location based services (LBS) increased in 2008 to more than 18 million, so enormous growth in the use of mobile services can be expected over the coming years. For good location based services there is a need for an effective medium of interaction between the human and the mobile device. The ways in which this can be achieved can roughly be classified as: graphical user interfaces (GUI), speech user interfaces, haptic user interfaces, gaze interfaces, and computer vision.

When selecting an interaction technique, we must identify the task that we are trying to make easier and more natural for the user; how we plan to do that determines which modalities we use for input and output. These include:
- output: graphic feedback, haptic feedback, auditory feedback, speech synthesis;
- input: manual (haptic) input, speech input, voice input, eye tracking, computer vision.

If multimodal interaction is to be used, then how do we overcome the complexities that come with it? Psychological analysis is required in order to assess which of the methods are easy or hard to understand and which people prefer using. We must select the programming platform and additional software packages based on the modalities chosen for our work. Much work is ongoing to make the browser location-aware: with the Geolocation API, the browser will be able to recognize your current location and return search results based on it. This will cut down on the human interaction required. Mozilla [6] has recently included a new experimental add-on called Geode in its Firefox web browser. Geode provides a rudimentary implementation of geolocation for the current version of Firefox by using a single hard-coded location provider to enable Wifi-based positioning conforming to the W3C Geolocation specification [7], so that developers can begin experimenting with location-aware applications today.

Since mobile devices have constraints such as small memory, small displays, latency, and limited user input, we need to ensure the application does not overlook these factors. The choice between browser-based and widget-based applications is also important.
The use of a graphical display such as maps or textual descriptions requires the user to look at the screen while on the move, interrupting the user's other activity. Audio feedback likewise demands the user's attention at all times, limiting the user's ability to perform other tasks simultaneously.

Car navigation systems have evolved well over the years to provide better communication between the user and the system. These systems work well with the GPS satellites because cars travel on roads where it is relatively easy to receive good signals. Also, in the case of the car, neither the user's context nor the field of view varies much, so it is comparatively easy to provide effective navigation to the user. Pedestrian navigation is much more complex: the user's context varies quite rapidly and the system should change dynamically to match, and a visual interface or audio feedback requires the user's attention. Haptics, on the other hand, does not need much of the user's attention while on the move. In this paper, we propose haptic feedback for pedestrian navigation alongside the visual interface on the mobile device.

II. NAVIGATION USING HAPTIC FEEDBACK

The word haptics refers to the capability to sense a natural or synthetic mechanical environment through touch. Haptics also includes kinesthesia (or proprioception), the ability to perceive one's body position, movement and weight (Hayward et al. [1]).


GentleGuide [2], an exploration of haptic output for indoor pedestrian guidance (e.g., to a room in a hospital), is limited to indoor navigation with fixed paths. Most other work on haptics focuses on people with a visual disability, such as the wearable navigation system by Ertan et al. [3]. Our work aims to use the sense of touch efficiently to help users navigate in an outdoor environment, similar to Keyson [4].

We are therefore designing for various contexts and for a broad range of users. While modelling the system we also keep in mind aspects such as the user's level of experience, cultural background, physical abilities, cognitive abilities and work domain, among others. We also plan to model the user's location or travel history and then try to provide context-aware results based on digital diary/calendar inputs for reminders, as well as current weather and other environmental information.

III. SYSTEM IMPLEMENTATION

Figure 1: Flow of information in the haptic feedback system (1.a: mobile device with GPS and digital compass; 1.b: controller; 1.c: haptic device).

Our haptic feedback system (Fig 1) allows the user to keep the mobile device (Fig 1.a) inside a pocket or bag; with the help of a Bluetooth connection to a Wiimote (Fig 1.c), or using the mobile phone itself, we can guide the user to the destination with haptic feedback in the form of vibrations. A controller (Fig 1.b) is needed to decode the encoded signals received by the Wiimote. The user is guided along the right path to the destination by altering the intensity and/or duration of the vibrations. If the user moves to an entirely different location off the specified path (e.g., if it suddenly rains, the user might run towards a building for shelter), we also plan to dynamically re-route and provide the best route based on the user's current position.

The digital compass, along with the GPS embedded in the mobile device, lets the user navigate in open space. We also attempt to provide the best modality combination for the current environment, so if the user wants to view the path on a map in the middle of navigation, he or she can switch to the map mode.
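The core guidance step reduces to comparing the bearing towards the next waypoint with the compass heading and encoding the signed error as a vibration pattern. A minimal sketch of that idea; the thresholds and pulse encoding are illustrative assumptions, not the implemented signalling protocol:

    import math

    def bearing(lat1, lon1, lat2, lon2):
        # Initial great-circle bearing (degrees) from the current GPS fix
        # to the next waypoint.
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dl = math.radians(lon2 - lon1)
        y = math.sin(dl) * math.cos(p2)
        x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl)
        return math.degrees(math.atan2(y, x)) % 360

    def vibration_cue(heading, target_bearing):
        # Encode the signed heading error as a (pulse count, duration ms) pair.
        error = (target_bearing - heading + 180) % 360 - 180
        if abs(error) < 15:
            return (1, 100)                              # on course: one short pulse
        return (2, 300) if error > 0 else (3, 300)       # veer right / veer left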
IV. CONCLUSION

Getting the mobile device to interact wirelessly with the Wiimote was the first challenge; now we need to understand how usable the Wiimote is going to be. For this we are conducting user trials in which paths have been predefined and we ask the user to reach the destination based on the haptic feedback and also by using the visual navigation, taking feedback to see how effective the haptic feedback is. We intend to use OpenStreetMap as the visual interface as it is freely available for download and we can customize it to our requirements. During the initial stages we will use the data for Ireland, which we have downloaded, as our test bed. Haptic feedback will be useful as it allows the user to be involved in other work, without demanding attention in the way audio or visual feedback does.

ACKNOWLEDGMENT

Research presented in this paper was funded by a Strategic Research Cluster grant (07/SRC/I1168) by Science Foundation Ireland under the National Development Plan. The authors gratefully acknowledge this support.

REFERENCES
[1] V. Hayward, O.R. Astley, M. Cruz-Hernandez, D. Grant, G. Robles-De-La-Torre, "Haptic interfaces and devices", Sensor Review 24(1), pp. 16-29, 2004.
[2] S. Bosman, et al., "GentleGuide: An exploration of haptic output for indoors pedestrian guidance", 2003.
[3] S. Ertan, C. Lee, A. Willets, H. Tan, A. Pentland, "A Wearable Haptic Navigation Guidance System", 1998.
[4] D.V. Keyson, "Touch in User Interface Navigation", 1997.
[5] ABI Research, http://www.abiresearch.com/, 2009.
[6] Mozilla, http://www.mozilla.com/, 2009.
[7] W3C Geolocation, http://www.w3.org/TR/geolocation-API, 2009.


Feedback Control Models and Their Application in Pedestrian Navigation Systems

Lei Yan (1), Zheng Pan (1,3), Adam C. Winstanley (2), A. Stewart Fotheringham (3), Jianghua Zheng (2)
1. Beijing Key Lab of Spatial Information Integration & Its Applications, Institute of RS&GIS, Peking University, Beijing, China
2. Department of Computer Science, National University of Ireland, Maynooth, Co. Kildare, Ireland
3. National Centre for Geocomputation, National University of Ireland, Maynooth, Co. Kildare, Ireland

Abstract—Feedback control theory has been widely used in many fields; this paper introduces the theory into a model for a pedestrian navigation system. Based on the model, several feedback channels are designed and analysed using control theory. The pedestrian is not only a data receiver but also a data collector. All collected information is stored in a temporal database and can be used for spatial-temporal analysis. At the same time, feedback control theory can integrate all modules of the system as a whole, which helps to improve overall effectiveness. Based on the information fed back and on feedback control theory, the pedestrian navigation system will help users to "see more, understand better, and decide more quickly."

Keywords—Feedback control model, pedestrian navigation system, LBS (location based services)

I. INTRODUCTION

With advances in communication, GPS and GIS technologies, vehicle navigation systems are now widely used. In recent years, pedestrian navigation systems have become an increasingly active research field. It is not always easy for pedestrians to find the right way to their destination in an unfamiliar environment; pedestrian navigation systems can help them to find an optimal route. Navigation for pedestrians is different from that for vehicles. Different pedestrians may choose totally different routes: they can walk along a road or across a square; they can follow a twisting narrow passage, or even cross a muddy grass field. The pedestrian needs a highly efficient and individualized navigation service. Furthermore, the pedestrian may also want to know the current traffic situation before deciding which means of transportation to use.

There are many pedestrian navigation systems available, but most implementations focus on how to accurately determine the user's location or how to add more functional modules to the system. Improving the overall effectiveness of several collaborating modules is still a problem that needs further study.

Some pedestrian navigation systems can collect and process information from different sources. This information can help users to choose a more suitable route. However, this data is usually considered transient and discarded after a short time. Archiving data for profiling user preferences or route information has not been much used.

In this paper, we focus on how to improve the efficiency of pedestrian navigation systems. In section 2, we briefly introduce related work in the fields of pedestrian navigation and feedback control theory. In section 3, we describe why feedback control theory can usefully be applied in pedestrian navigation systems and how it may be modelled. The paper closes with some concluding remarks and suggested enhancements for a practical pedestrian navigation system.

II. RELATED WORK

A. Pedestrian Navigation Systems

Pedestrian navigation systems have been discussed for a long time. However, despite the fact that many navigation solutions include a pedestrian mode setting, none of these were worthy of that name [1].
Maybe this is an extreme verdict on the available applications for pedestrian navigation, but it reflects that there is still a lot to do in this key area of location based services. Recent advances in mobile phones, positioning technologies, and wireless networking infrastructures are making it possible to implement and operate large-scale pedestrian navigation in the real world, and will result in it being more widely available in the future.


Among existing services, NAVITIME is probably the most successful commercial navigation service for the public in Japan, with pedestrian navigation as one of its main services. It became available early this century and now has over one million users in Japan [2].

As a whole, pedestrian navigation is still at the edge of new positioning and communication technologies, and several of its requirements have not been fully achieved. These include modelling the full complexity of walking routes, indoor navigation, personalized services suited to various pedestrian behaviours, and full 3D applications [3][4][5]. With the advent of more powerful hardware and software technologies in several key areas, including communication, computing, positioning and spatial representation, more friendly and powerful pedestrian navigation services will become a reality.

B. Feedback Control Theory

Control theory is an interdisciplinary branch of engineering and mathematics which deals with the behaviour of dynamical systems. The desired output of a system is called the reference. When one or more output variables of a system need to follow a certain reference over time, a controller manipulates the inputs to the system to obtain the desired effect on its output [6][7].

To avoid the problems of the open-loop controller, control theory introduces feedback. A closed-loop controller uses feedback to control states or outputs of a dynamical system. Its name comes from the information path in the system: process inputs (e.g. voltage applied to an electric motor) have an effect on the process outputs (e.g. velocity or torque of the motor), which are measured with sensors and processed by the controller; the result (the control signal) is used as input to the process, closing the loop.

Feedback is a mechanism, process or signal that is looped back to control a system within itself; such a loop is called a feedback loop. Intuitively, many systems have an obvious input and output. Feeding back part of the output so as to increase the input is positive feedback; feeding back part of the output in such a way as to partially oppose the input is negative feedback. Negative feedback helps to maintain stability in a system in spite of external changes. Positive feedback amplifies possibilities of divergence (evolution, change of goals); it is the condition for change, evolution and growth, and gives the system the ability to reach new points of equilibrium [8][9].

Feedback control theory is widely used in many fields, such as biology, climate science, economics, electronic engineering and mechanical engineering, but it is seldom used in the navigation field, and especially not in pedestrian navigation systems.

III. MODELLING OF THE PEDESTRIAN NAVIGATION SYSTEM

The most important factor in a pedestrian navigation system is data. The system needs to get data from different sources and send processed data to users. Existing systems always treat the pedestrian as an information receiver, making them typical open-loop systems. In this paper, the pedestrian is not only a data receiver but also a data collector: the pedestrian can feed information back to the system, which forms a closed loop of information.

Fig. 1 shows the feedback control model of the pedestrian navigation system. As shown in the figure, there are three feedback channels in the system: user feedback, spatial-temporal analysis feedback and validity feedback. All three channels are negative feedback, which helps to maintain system stability.
Fig. 1: Feedback control model
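The closed-loop behaviour described above can be illustrated with a toy negative-feedback loop in code; the gain, step count and values are illustrative and not part of the navigation system itself:

    def closed_loop(reference, output, gain=0.5, steps=10):
        # Negative feedback: each step measures the deviation from the
        # reference and feeds back a correction that opposes it.
        for _ in range(steps):
            error = reference - output
            output += gain * error
        return output

    print(closed_loop(reference=10.0, output=0.0))  # converges towards 10.0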


A. User Feedback

Because pedestrians are not limited to the vehicle road network (they can walk across a square, or across a grass field), pedestrian navigation needs more information than vehicle navigation, including road network information, traffic information, POI (point of interest) information and other data. One solution is to add more information sources to the pedestrian navigation system; in this system, the pedestrian is taken as an extra data source.

There are data collecting interfaces in the system, so pedestrians can feed data back to the system promptly. For example, if a pedestrian gets caught in a traffic jam, he can upload the congestion information to the pedestrian navigation system, which can help other people to avoid heavy traffic. As another example, if a pedestrian finds a new shop while walking along a street, he can send the name of the shop to the system; the exact location of the shop is sent at the same time. In this way, a pedestrian navigation system can obtain more general and timely information.

B. Spatial-temporal Analysis Feedback

Pedestrian navigation systems can get information from different data sources, but most systems use these data only once. In fact, much useful information can be found by analysing historic data, which constitutes another feedback channel of the system.

A pedestrian navigation system needs to store several kinds of information. This information often evolves with time and location, so a temporal database is constructed to store the large volume of historic data. A temporal database supports the storage and retrieval of temporal data objects. Each entry in the database has one or more associated timestamps; all changes that occur are recorded, and past states of the database may be retrieved. Two different timestamps may be used: the transaction time, corresponding to the moment when the information is introduced into the database, and the validity time, the time when that information is valid in the real world.

By analysing the spatial-temporal data stored in the database, a pedestrian navigation system can not only reconstruct missing data but also predict future data. For example, real-time traffic information can help a pedestrian to choose a suitable route; if the traffic information for some road segments is missing for some reason, the missing data can be reconstructed by analysing historic temporal data. The results of spatial-temporal analysis can be fed back to the system as another data source.

C. Validity Feedback

Different users may choose different routes. Some pedestrians want the shortest route or the fastest path, while others may want an easy-to-walk route. The system calculates a route according to the user's particular request.

But do the calculated results truly meet the needs of users? A sidewalk may no longer be the fastest path because of road works, and a user may not find a path easy to walk because of mud after a rain storm. It is therefore very important to get feedback from users about system validity.

In the process of navigation, the pedestrian can evaluate the precision of the data they receive and the validity of the calculated path. For example, each road segment is assigned, in advance, a number representing the ease of travel over that segment: the smaller the number, the easier the segment is to walk. If a pedestrian reports that some road segment is not easy to walk, the system will increase the number after it receives the feedback. In this way, the system obtains the user's evaluation promptly, which is another kind of useful feedback for improving data processing efficiency.
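A minimal sketch of this validity-feedback update; the segment names, numbers and increment are illustrative assumptions:

    # Ease-of-travel numbers per road segment (lower = easier to walk).
    ease = {"segment_17": 2.0, "segment_18": 5.0}

    def report_hard_to_walk(segment, increment=1.0):
        # Validity feedback: a user report raises the segment's number,
        # making the route planner less likely to choose it next time.
        ease[segment] = ease.get(segment, 0.0) + increment

    report_hard_to_walk("segment_17")
    print(ease["segment_17"])  # 3.0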
IV. CONCLUSION

This paper applies feedback control theory to pedestrian navigation. Three kinds of feedback channels are discussed. User feedback can be used to collect more information; spatial-temporal analysis feedback can be used to reconstruct missing data and predict future data changes; and validity feedback is useful for improving system efficiency. Furthermore, these feedback channels connect the different parts of the system into a unit. Based on these feedback channels, pedestrian navigation systems will become more practical, efficient and timely.

REFERENCES
[1] Dominique Bonte, The Mobile World Congress 2008: Pedestrian Navigation at Last, 11 Feb. 2008.
[2] Masatoshi Arikawa, Shin'ichi Konomi and Keisuke Ohnishi, NAVITIME: Supporting Pedestrian Navigation in the Real World, IEEE Pervasive Computing, July–September 2007, pp. 21-29.
[3] Masakatsu Kourogi, Nobuchika Sakata, Takashi Okuma, and Takeshi Kurata, Indoor/Outdoor Pedestrian Navigation with an Embedded GPS/RFID/Self-contained Sensor System, ICAT 2006, LNCS 4282, pp. 1310–1321, 2006.


[4] Tracy Ross, Andrew May, and Simon Thompson, The Use of Landmarks in Pedestrian Navigation Instructions and the Effects of Context, MobileHCI 2004, LNCS 3160, pp. 300–304, 2004.
[5] Christian Gaisbauer and Andrew U. Frank, Wayfinding Model for Pedestrian Navigation, Proc. of 11th AGILE International Conference on Geographic Information Science, 2008.
[6] Hao Wang, Changcheng Huang, James Yan, A Feedback Control Model for Multiple-Link Adaptive Bandwidth Provisioning Systems, 2006 IEEE International Conference on Communications, Volume 3, June 2006, pp. 987-993.
[7] Yu Chen, Qionghai Dai, A feedback control model for resource management on streaming media servers, Video Image Processing and Multimedia Communications, 2003, 4th EURASIP Conference, Volume 2, 2-5 July 2003, pp. 835-840.
[8] Jeongho Hong, James C. Akers, Ravinder Venugopal, Miin-Nan Lee, et al., Modeling, Identification, and Feedback Control of Noise in an Acoustic Duct, IEEE Transactions on Control Systems Technology, Volume 4, No. 3, May 1996, pp. 283-291.
[9] B.G. Kim, Theoretic Framework for Feedback Control Models in High-Speed Networks, Database and Expert Systems Applications, 2003, Proceedings, 1-5 Sept. 2003, pp. 134-138.


Tram and Bus Tracker: A Dynamic Web Application for Public Transit Reliability

Bashir Shalaik and Adam C. Winstanley
Department of Computer Science, National University of Ireland, Maynooth, Co. Kildare, Ireland
bsalaik@cs.nuim.ie

Abstract—Transit quality information such as timetable adherence, bus arrival times and route performance has usually been disseminated through static environments: web pages, paper documents or other media. This paper describes a dynamic Geographic Information System-based web application which displays the same information interactively. Using data collected from an Automatic Vehicle Location (AVL) system, a map-based interface has been created to allow travellers and operators to see routes, stops and buses in motion. The collected information is archived for off-line analysis. The system allows users to query and display the day-to-day management of operations as well as to generate static performance reports, providing a complete view of transit system reliability.

I. INTRODUCTION

Environmental occurrences such as traffic congestion or urban construction present new obstacles for transportation. The design and implementation of a distributed transit vehicle information system can help reduce stress as well as improve confidence in, and perception of, transit systems [1]. Tram and Bus Tracker delivers real-time transit vehicle location and progress information, highlighting any deviations from the published timetable and any symptoms of bad service, such as the bunching of vehicles due to traffic conditions or incidents. Recent advances in communications and computing technology have made real-time transit information systems an interesting area of research. In this project a web-based application has been developed to explore techniques for showing transit vehicle performance. The system was built using the PHP scripting language, a MySQL database, client-side JavaScript, XML and the Microsoft Virtual Earth API.

The system can display transit vehicle locations in near real time on a map and offers users and operators interactive querying of a specific bus or route. In addition, it allows the operator to monitor and measure the vehicle fleet so as to improve the transit services provided.

II. TRANSIT SYSTEM RELIABILITY INDICATORS

Improving the reliability of services is one of the main objectives of transit agencies. Many performance indicators have been developed to assess transit performance; the choice of indicator depends on the frequency of the service. For high-frequency routes (a vehicle at least every ten minutes) these include Excess Waiting Time (EWT) [2], Headway Regularity Index (R) [3], Headway Reliability (RH) [4] and Travel Time Reliability (RT) [4]. On high-frequency routes passengers are more concerned with regularity, whereas for low-frequency routes their concern is more with punctuality.

EWT [2] is defined as "the measure of the additional wait experienced by passengers due to irregular spacing of buses or those that failed to run". EWT can be calculated by subtracting scheduled waiting time (SWT) from average waiting time (AWT), i.e.

EWT = AWT − SWT   (1)

The term headway denotes the time interval between successive vehicles on the same route, travelling in the same direction, as they pass a particular point on that route [5].

The headway regularity index is a reliability performance indicator for buses at stop, route, or system level [3]. Service regularity is measured by comparing the actual with the scheduled headway.
A high headway index (R) indicates a regular service whereas low values indicate headway irregularities:

R = 1 − ( 2 · Σ_{r=1..n} r·(h_r − H̄) ) / ( n² · H̄ )   (2)

where:
r   rank of the headway (1..n)
n   total number of headway measures
h_r the series of headways, in ranked order
H̄   mean headway


When the headway measures are equal over the n observations, the headway regularity index is 1. In this paper the headway regularity index was calculated for one bus route in both directions.

Many factors, such as traffic conditions, route characteristics, passenger characteristics and operational conditions, contribute to bus unreliability. The term reliability can be defined as "the ability of the service to provide consistent service over a period of time" [5]. In this paper two types of bus reliability are measured: travel time reliability and headway reliability.

Travel time reliability measures the variability in bus journey time for a specific bus route within a specific time interval at a specific level of service [4]. It is defined as the mean over the standard deviation of travel time:

RT = μ_t / σ_t   (3)

A higher value of RT indicates better reliability.

Maintaining the scheduled headways by keeping regular spacing between buses will minimise the average passenger wait time and eliminate the bunching of buses. Bunching happens when service is absent during its scheduled time, causing high passenger demand for the next bus. Headway reliability is another proposed reliability indicator; it is defined as the standard deviation over the mean headway:

RH = σ_t / μ_t   (4)

Smaller values of RH indicate better headway reliability.
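Once headways and travel times have been extracted from the AVL archive, the indicators of Eqs. (2)–(4) reduce to a few lines of code. A minimal sketch with illustrative sample values:

    from statistics import mean, pstdev

    def regularity_index(headways):
        # Headway regularity index R of Eq. (2); 1.0 = perfectly regular.
        h = sorted(headways)                       # ranked headways
        n, hbar = len(h), mean(h)
        dev = sum(r * (hr - hbar) for r, hr in enumerate(h, start=1))
        return 1 - 2 * dev / (n * n * hbar)

    def travel_time_reliability(times):
        # RT of Eq. (3): mean / standard deviation (higher is better).
        return mean(times) / pstdev(times)

    def headway_reliability(headways):
        # RH of Eq. (4): standard deviation / mean (smaller is better).
        return pstdev(headways) / mean(headways)

    headways = [9.0, 11.0, 10.0, 12.0, 8.0]        # minutes, illustrative
    print(regularity_index(headways), headway_reliability(headways))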


III. TRAM AND BUS TRACKER

Tram and Bus Tracker (www.bustracking.co.uk) is a joint project between NUIM and Blackpool Transport that uses various reliability measures to visualize the behaviour of vehicles in ways that allow the operator to better assess and improve the quality of service. The system uses off-the-shelf GPS/GPRS integrated units programmed to transmit location at regular intervals (approximately 45 seconds) while the vehicle is in motion. The data is stored on a server and can be displayed through a standard web browser to show the current locations of vehicles in close to real time. The system displays real-time bus locations pictorially, textually and, using the facilities provided by the Microsoft Virtual Earth API, on 2D and 3D maps (figure 1).

Fig. 1: The public interface showing an updating textual display plus moving locations on Microsoft Virtual Earth

In order to improve services, as well as providing real-time information, the system builds up an archive of data that can be analysed and mined for information showing the behaviour of the transport system over time, indicating problems such as vehicle bunching and delays due to congestion. In addition, to qualify for public subsidies, operators must report quality of service metrics to government. These are usually calculated manually, but the existence of a full data archive gives the potential for automation.

IV. BLACKPOOL ROUTE 5

The City of Blackpool, UK, lies along the coast of the Irish Sea. It has a population of 142,900, making it the fourth-largest settlement in North West England. The bus services in the city are operated by Blackpool Transport Services. To demonstrate the analysis and evaluation of bus services in Blackpool, Line 5, a high-frequency route, was selected as a test case. This bus route contains 73 bus stops in both directions, 14 of which are timing points where departure times are quoted in the public timetable. Figure 2 shows Line 5 in Blackpool.

Fig. 2: Route 5 in Blackpool

V. IMPLEMENTATION

Excess Waiting Time (EWT) is a standard metric used to measure the quality of service on high-frequency public transport. It is a key performance indicator since it denotes how much time passengers actually had to wait in excess of what they would have expected if the service were perfect. EWT is calculated by subtracting Scheduled Waiting Time (SWT) from Average Waiting Time (AWT), and it is this which is used as the measure of reliability: the greater the EWT, the less reliable the service [2]. EWT can be calculated on a daily, weekly or monthly basis. Figure 3 shows EWT values on one day for all bus stops along route 5 in Blackpool.

Fig. 3: EWT values for different bus stops on route 5

Bus bunching and headway overlapping can easily be noticed. The headway regularity index was calculated to measure the quality of service; the result is shown in figure 4. High index values indicate a regular service whereas low values indicate headway irregularities.

Fig. 4: Headway regularity index values

Figure 4 shows a trend towards higher headway regularity for all bus stops on the day the headway index was calculated. Starting from the bus terminus, travel time reliability was measured for each bus stop in the same direction; figure 5 shows the travel time reliability measure for different bus stops.

Fig. 5: Bus stops with travel time reliability measure

Bus stops in the southbound direction have higher travel time reliability than stops in the northbound direction, which means that they have a more reliable travel time. For headway reliability, smaller values indicate better reliability. Headway reliability values seem to be high on the second segment of the route and tend to be lower towards the end of the route (figure 6).


Fig. 6: Bus stops with headway reliability measures

VI. CONCLUSION AND FUTURE WORK

Tram and Bus Tracker is a dynamic web application system with an intuitive interface and the ability to measure the reliability of transit services. The system can be useful to users (travellers) and to operators in that, as well as providing transit information in close to real time, it also provides tools to analyse performance and see where improvements are needed. The system architecture can be developed to be applicable to different transit services or regions.

VII. ACKNOWLEDGEMENTS

Thanks are due to Blackpool Transport Ltd for facilitating this project and particularly to Oliver Howarth for his comments and feedback. One author is supported by a PhD studentship from the Libyan Ministry of Education.

REFERENCES
[1] D.S. Maclean and D.J. Dailey, "Busview: A Graphical Transit Information System", Proceedings of the IEEE Conference on Intelligent Transportation Systems, Oakland (CA), USA, August 25-29, 2001.
[2] "Performance Information London Bus Services Ltd.", London Bus Quality of Service Indicators, London, Fourth Quarter (2006/2007).
[3] M. Hofman and M. O'Mahony, "The Impact of Adverse Weather Conditions on Urban Bus Performance Measures", Proceedings of the 8th International IEEE Conference on Intelligent Transportation Systems, Vienna, Austria, September 13-16, 2005.
[4] R. Liu and S. Sinha, "Modelling Urban Bus Service and Passenger Reliability", Institute for Transport Studies, University of Leeds.
[5] S. Zolfaghari, M.Y. Jaber and N. Azizi, "A Multi-Attribute Performance Measurement Model for Advanced Public Transit Systems", Journal of Intelligent Transportation Systems, 7:3, 295-314, July 01 2002.


Section 1C
SIGNAL PROCESSING


Digital Audio Watermarking by Magnitude Modification of Frequency Components Using the CSPE Algorithm

Jian Wang (1), Ron Healy (2), Joseph Timoney (3)
Computer Science Department, National University of Ireland, Maynooth, Co. Kildare, Ireland
(1) jwang@cs.nuim.ie  (2) ronhealyx@gmail.com  (3) jtimoney@cs.nuim.ie

ABSTRACT:

In this paper we describe a process whereby the magnitude of either one or two frequency components of a signal is modified so that it encodes a hidden message within the signal, in such a way that a casual observer would have no way of noticing the presence of the message. Previous work has used filtering and signal addition to achieve the same goals. The current work improves on this by using a recent super-resolution component-identification technique to isolate the components to modify, limiting the impact on the quality of the signal.

Keywords: Signal processing, digital audio watermarking, data hiding, steganography

1.0 INTRODUCTION

The concept of steganography, defined as "the art or practice of concealing a message, image, or file within another message, image, or file" [1], is not new. Steganography may be combined with cryptography in order to make message data more secure even if the presence of the message is discovered. Digital watermarking of audio and video is a form of steganography, in that the audio/video can be used to 'hide' the presence of other information.

In recent years there has been a marked increase in research in the area of digital watermarking. This has been driven, in part, by the needs of the entertainment industry to find means of protecting, tracking or identifying intellectual property such as photographs, music and movies. The SDMI (Secure Digital Music Initiative, a group consisting of more than 200 companies in the fields of I.T., music and entertainment, consumer electronics, security and Internet service providers) challenge at the turn of the century, with regard to digital music, contributed to much investigation into digital watermarking over the intervening years. Eventually, the SDMI folded, claiming that it was awaiting developments in technology before implementing digital rights management technologies. One of the reasons identified for the SDMI's failure was that the technologies then available were insufficient to achieve the aim of completely hiding an added watermark from those expert or talented listeners described as 'golden ears'. This meant that there was no way of preventing detection and ultimate removal of the watermark. The watermarking technology that the SDMI purported to recommend to the industry was broken almost immediately [2].

There have been a number of alternative propositions for hiding data in cover signals and most are successful to a certain extent or in a given context. A good overview of the theories in this area can be found in [3]. The basic premise of watermarking schemes is that the information to be watermarked, w, is added or embedded in the cover or host signal s to produce a watermarked signal s':

s + w = s'   (1)

This paper proposes a technique for hiding data in cover audio signals, specifically music or spoken word, by the identification and modification of the magnitude of frequency components in the cover signal itself.

In part, the work is inspired by [4], a technique designed for covert communications across a radio channel for military applications, and follows on from earlier work which used the addition of multiple frequency components to achieve a similar aim [5].
In [5] it was proposed that the message to be embedded was separately generated and then added to the host or cover audio. In this paper, however, we instead propose that the host or cover is itself modified in a controlled manner, rather than having potentially destructive and/or detectable content added to it.


In both this paper and [5], the primary concern is inaudibility of the watermark and blind or semi-blind detection, meaning that the decoder has no knowledge of either the content of the cover audio or of the embedded watermark prior to decoding. This restriction is guided by the intended use of the technology.

In this paper, we present the results of experiments performed to recover a bit sequence which was embedded in a synthesised cover audio signal consisting of randomly generated components. The decoding was performed without any reference to the original unwatermarked signal or to the watermark itself.

2.0 METHOD

A component value is first chosen which is used as the basis for calculating which components to modify to hide the message. The initial component choice may depend on various factors, such as the type of audio used as host/cover. For example, human speech generally consists of lower frequency components – and fewer of them – than a modern rock or pop song, so hiding data in a recording of speech would naturally limit the component of choice. However, even in such a limited range, there are still thousands of values to choose from.

The value of the chosen component becomes, in effect, a private key: this value is needed in order to decode the watermark – assuming that the presence of the watermark has previously been detected. This adds to the security of the technique when used in an environment where the security of the hidden message's content is an issue.

The signal intended as the cover or host audio is segmented into frames of uniform length and each frame is then analysed using Complex Spectral Phase Estimation (CSPE) techniques [6] to identify the presence and magnitude of its inherent components. Previously, FFT techniques have been used to approximate the relative strengths of inherent components. This would be inadequate for this project, as exact measurement of components using the FFT is only possible if the component is aligned with an analysis bin. This is an unlikely occurrence in a real-world signal such as recorded music or speech. Therefore, the FFT is an inadequate solution to the problem of identifying the components present exactly.

2.1 CSPE INTRODUCTION AND DESCRIPTION

The CSPE algorithm was introduced as a method to accurately estimate the frequency of components that exist within a short time frame. It was also designed to be computationally efficient. It is related in some respects to the cross-spectrogram technique of [7]. The principle of the CSPE algorithm can be described as follows: an FFT analysis is performed twice, first on the signal of interest and a second time on the same signal shifted in time by one sample. Then, by multiplying the sample-shifted FFT spectrum with the complex conjugate of the initial FFT spectrum, a frequency dependent function is formed from which the exact values of the frequency components it contains can be detected. The procedure of the CSPE algorithm is depicted in block diagram form in Figure 1.

Fig. 1: The flow diagram of CSPE (the signal s0 and its one-sample-shifted copy s1 are each windowed and transformed by FFT; the spectra are multiplied bin by bin, one conjugated, and the angle of the product yields the frequency estimate)

Mathematically, the algorithm can be described as follows. Assume a real signal s0 and a one-sample shifted version of this signal, s1. Say that its frequency is β = q + δ, where q is an integer and δ is a fractional number.
If b is an initial phase, w_n is the window function used in the FFT, F_w s0 is the windowed Fourier transform of s0, and F_w s1 is the windowed Fourier transform of s1, then, from [6], we find:


D = e^(j2πβ/N)   (2)

The frequency dependent CSPE function can be written as

CSPE = F_w s0 · (F_w s1)*
     = (a/2)² [ ‖F_w(D^n)‖² D^(−1) + ‖F_w(D^(−n))‖² D + 2Re{ e^(j2b) · D · F_w(D^n) ⊗ F_w(D^(−n)) } ]   (3)

The windowed transform requires multiplication of the time domain data by the analysis window, and thus the resulting transform is the convolution of the transform of the window function, w_f, with the transform of a complex sinusoid. Since the transform of a complex sinusoid is a pair of delta functions in the positive and negative frequency positions, the result of the convolution is merely a frequency-translated copy of w_f centred at +β and −β. Consequently, with a standard windowing function, the ‖F_w(D^n)‖ term is only considerable when k ≈ β and it decays rapidly when k is far from β. Therefore, the analysis window must be chosen carefully so that it decays rapidly, to minimize any spectral leakage into adjacent bins. If this is so, it renders the interference terms, i.e. the second and third terms, negligible in Eq. (3). Thus, the CSPE for the positive frequencies gives:

CSPE_w ≈ (a²/4) ‖F_w(D^n)‖² D^(−1)   (4)

From Eq. (4) we find the CSPE frequency estimate

f = −(N/2π) · ∠(CSPE_w)
  = −(N/2π) · ∠( (a²/4) ‖F_w(D^n)‖² D^(−1) )
  = −(N/2π) · ∠( e^(−j2πβ/N) )
  = −(N/2π) · (−2πβ/N) = β   (5)

The frequency dependent function, as illustrated in Equation (4), produces a graph with a staircase-like appearance where the flat parts of the graph indicate the exact frequencies of the components. The width of the flat parts depends on the main-lobe width of the window function used to select the signal before FFT processing. An example of the output of the CSPE algorithm is shown in Figure 2. Consider the signal S1, which contains components with frequency values (in Hz) of 17, 293.5, 313.9, 204.6, 153.7, 378 and 423. The sampling frequency is 1024 Hz. A frame 1024 samples in length is windowed using a Blackman window and is padded with 1024 zeros. The frequency dependent CSPE function is computed as per Equation (5). As shown in Figure 2, each component can be calculated; these are identified with arrows in the graph. The largest error among all the estimates of the components' frequencies is approximately 0.15 Hz.

Fig. 2: Frequency estimation of S1 by CSPE (estimated frequency value plotted against bin index)

Notice too in Figure 2 that the widths of the flat sections where the arrows point are related to the width of the window's main lobe in the frequency domain.

In addition, with CSPE we can obtain the amplitude and phase of the kth frequency component using the following equations, where W(ω − fcspe(k)) is the Fourier transform of the window function shifted to fcspe(k) in the frequency domain:

Amp_k = | 2 · F_w s0 / W(ω − fcspe(k)) |   (6)

Phase_k = ∠( 2 · F_w s0 / W(ω − fcspe(k)) )   (7)
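The estimator of Eq. (5) is compact in code. A minimal sketch, following the windowing and zero-padding of the example above (an illustration, not the authors' implementation):

    import numpy as np

    def cspe_frequencies(frame, pad_factor=2):
        # CSPE of Eq. (5): window the frame and its one-sample-shifted copy,
        # take FFTs, multiply one spectrum by the conjugate of the other and
        # convert the per-bin angle to a frequency estimate.
        n = len(frame) - 1                       # length of the shifted pair
        w = np.blackman(n)                       # analysis window, as above
        F0 = np.fft.fft(frame[:-1] * w, pad_factor * n)
        F1 = np.fft.fft(frame[1:] * w, pad_factor * n)
        cspe = F0 * np.conj(F1)                  # bin-by-bin product
        # f = -N * angle(CSPE) / (2*pi), in cycles per n-sample window;
        # read values off at bins where |F0| peaks (the flat regions).
        return -n * np.angle(cspe) / (2 * np.pi)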


2.2 EXPERIMENTAL EVALUATION OF CSPE

Experiments were designed to evaluate the performance of the CSPE algorithm in correctly identifying frequency components within a multiple-component signal. In each set of experiments, a total of 500 signals, with sampling frequency 44100 Hz and containing components across the human hearing range of 100 Hz to 20,000 Hz, were generated. Each signal contained many equally spaced frequency components; the number of components varied from signal to signal. For each individual signal there is a unique, randomly generated step constant which defines the spacing between two neighbouring frequency components of the signal. The 500 step constants, one per signal, ranged from 169 Hz to 668 Hz. Equations (8) and (9) were designed to assess CSPE accuracy in frequency estimation.

Denote by Freq_est_k the estimated frequency components of signal k; by Freq_org_k the original frequency components of signal k; by M_k the number of frequency components contained in signal k; by FreqError_k the frequency estimation error between Freq_est and Freq_org for signal k; and by MeanError_cspe the mean error of the CSPE frequency estimation over N signals. For this experiment N = 500, and M_k changes with the signal's step constant.

FreqError_k = ( Σ_{i=1..M_k} |Freq_est_k(i) − Freq_org_k(i)| ) / M_k   (8)

The frequency estimation error of each signal, computed using Equation (8), is shown in Figure 3.

Fig. 3: CSPE estimation error for each signal (frequency estimation error in Hz against signal index)

The distribution of the frequency estimation error (FreqError) is shown in Figure 4.

Fig. 4: The distribution of frequency estimation error (signal counts against frequency estimation error)

The mean error is calculated according to Equation (9):

MeanError_cspe = ( Σ_{k=1..N} FreqError_k ) / N   (9)

From the data we note that 97.8% of the signals analysed using the CSPE algorithm resulted in a FreqError value of less than 0.1 Hz, and the MeanError_cspe is 0.0174 Hz, meaning that the algorithm identified the components to within 0.1 Hz in almost all cases. We conclude from these results that the CSPE is extremely accurate in frequency estimation for signals containing constant-frequency components. With an accurate estimate of the frequency, the amplitude and phase can be estimated using Eqs. (6) and (7).
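Equations (8) and (9) amount to a per-signal mean absolute error and its grand mean. A minimal sketch with illustrative values:

    def freq_error(est, org):
        # Eq. (8): mean absolute frequency error over one signal's components.
        return sum(abs(e - o) for e, o in zip(est, org)) / len(org)

    def mean_error(errors):
        # Eq. (9): mean of the per-signal errors over all N signals.
        return sum(errors) / len(errors)

    print(freq_error([100.02, 350.01], [100.0, 350.0]))  # 0.015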


3.0 MODIFYING COMPONENTS

Once the user-defined component has been identified in the signal by the CSPE algorithm, its magnitude is calculated. It is then a matter of modifying the magnitude of this component, weighting it against a second value from within the signal, in order to represent a single bit '1' or '0'. We may choose to weight the user-defined component against the average power of the frame in which the bit is to be embedded; this was the procedure followed in both [4] and [5]. We may also choose to modify the user-defined component against a second component. This method has its advantages and disadvantages, but it is not our intention to detail the process in this paper. However, using a second component from within the signal as a comparison against which the first, user-defined component was weighted led to some problems: while the CSPE algorithm is very accurate in identifying the components in a synthesised signal with little variation, this may not reflect the kind of component make-up encountered in real-world signals, such as audio and speech.

3.1 DYNAMICALLY SELECTING COMPONENTS

We decided to make the process of choosing the component(s) to modify as flexible as possible by making it a dynamically chosen pair of values, dependent on the user-defined value but also on the signal under consideration, and reliant on the ability of the CSPE algorithm to detect and identify the components that the watermarking process would use. We defined the components to be chosen for modification as the nearest components above and below the user-defined value by more than a calculated threshold, as illustrated in Equation (10), where compA is the highest CSPE-detected frequency component lower than the user-defined component u by more than the threshold k, while compB is the lowest CSPE-detected frequency component above the user-defined component by the same threshold amount:

(compA < (u − k)) < u < (compB > (u + k))   (10)

What is interesting to note, using the formula in Equation (10) for defining which component needs to be modified, and in which frames of the cover signal, is that only approximately half of the frames will require any modification. This is because the relationship between the values of the two chosen components in any given frame may already fit the criteria used for representing a '1' or a '0'; in that case they do not have to be modified in any way. This consideration makes this method far more favourable than [5].

When modifying the amplitude of a frequency component, care must be taken to ensure that we do not introduce any noticeable artefact which would have an impact on sound quality. Similarly, we must ensure that the alteration we make to the magnitude of the chosen component is not so great as to have a negative impact on the timbre of the original signal.

We define a set of rules that lead to the modification of only one of the components (compA or compB) in approximately half the frames. This is achieved by setting the rule (Amp refers to amplitude):

If bit = 1, let Amp(compA) > Amp(compB) + margin
If bit = 0, let Amp(compB) > Amp(compA) + margin

The system then compares the magnitudes of both components (compA and compB) in any given frame before deciding whether any modification is required to satisfy these criteria, depending on the bit to be embedded and the magnitudes of the two components in that particular frame. If they are already in the correct relationship, no modification is required. If, however, they are not, at least one of them must be modified.

The decision to modify a component leads to another question. Let us assume that the magnitude of compA is lower than that of compB in a frame in which it needs to be of a higher magnitude, to represent a '1' bit.
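The selection rule of Eq. (10) and the embedding rule above can be expressed directly. A minimal sketch; the component list, u, k and margin are illustrative:

    def select_pair(components, u, k):
        # Eq. (10): nearest detected components below u - k and above u + k.
        comp_a = max(c for c in components if c < u - k)
        comp_b = min(c for c in components if c > u + k)
        return comp_a, comp_b

    def needs_modification(bit, amp_a, amp_b, margin):
        # True when the frame does not already encode the bit.
        if bit == 1:
            return not (amp_a > amp_b + margin)
        return not (amp_b > amp_a + margin)

    print(select_pair([153.7, 204.6, 293.5, 313.9], u=250.0, k=20.0))
    # (204.6, 293.5)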
3.2 MODIFYING THE MAGNITUDE
As mentioned in Section 2.0.2, the CSPE algorithm can be used to accurately identify a component within a signal, and then to calculate its phase and amplitude. In order to increase the magnitude of a particular frequency component in the cover signal S(t), we add a component at a defined magnitude, matched to the phase of the component it is being combined with, as illustrated in Equation (11):

$$S(t) = S(t) + (rAmp - lAmp + threshold)\cos(2\pi\,compA\,t + lp) \qquad (11)$$

where rAmp, lAmp, compA and lp denote the amplitude of compB, the amplitude of compA, the frequency of compA and the phase of compA, respectively.

Similarly, if we decide to reduce the magnitude of a component of S(t) so that it satisfies the requirements for embedding a '1' bit, we do this by reducing the magnitude of the component to the right of the user-defined value, by adding in a component that is 180° out of phase with the original component in the signal, as follows:

$$S(t) = S(t) + (rAmp - lAmp + threshold)\cos(2\pi\,compB\,t + rp + \pi) \qquad (12)$$

where compB and rp denote the frequency and phase of compB, respectively.

4.0 DECODING
In order to process candidate audio for detection and decoding of a potentially embedded watermarked message, the system must first be provided with the user-defined


value used as a basis for calculating the embedding values, along with the rules that define a '1' bit and a '0' bit. The candidate audio signal is then segmented into frames using the same frame size as was used for embedding. The system calculates the magnitude of the embedded component and performs a simple comparison, from which the watermarked bit sequence can be recreated. It would be a comparatively simple matter of applying the CSPE algorithm to identify the two components above and below the user-defined value by more than a pre-defined threshold. These two components would then have their magnitudes compared, and a '1' or '0' bit determined according to the rules used in their embedding.

5.0 EVALUATION OF WATERMARKING SCHEME
A series of experiments was carried out to evaluate the performance of this codec, based on the same 500 signals introduced in Section 2.2. For each signal, a randomly generated binary bit-sequence of length 150 was embedded by modifying the magnitudes of components as described in Section 3. The system then decoded the modified signal in order to detect the watermarked code. The difference between these two code sequences is calculated using the equations below (renumbered (13) and (14) here to avoid a clash with Equation (12) above), where DCode denotes the code sequence obtained at the decode side, ECode denotes the code sequence embedded in the signal, CodecPrecision_k denotes the precision of the decode process with code length L for signal k, and MeanPrecision denotes the average precision of the decode process over N signals. In this experiment, L and N are set to 150 and 500 respectively. The results of this experiment are depicted in Figure 5.

$$CodecPrecision_k = \frac{L - \sum_{i=1}^{L}\left|DCode(i) - ECode(i)\right|}{L} \qquad (13)$$

$$MeanPrecision = \frac{\sum_{k=1}^{N} CodecPrecision_k}{N} \qquad (14)$$

Fig. 5 Precision of Codec for each Signal (precision vs. signal index)

The distribution of CodecPrecision is shown in Figure 6.

Fig. 6 The distribution of Precision of Codec (signal counts vs. precision)

From the experimental results, it can be seen that 99.2% of signals produce a CodecPrecision value of 1 (100%). This means that, from 500 randomly generated signals with multiple components of different frequency spacing, watermarked with a binary bit-sequence of 150 bits, 99.2% of these signals were decoded to the exact 150-bit sequence. Only 0.8% (a total of 4) of the 500 signals were not decoded perfectly, and for those the bit-sequence recovery rate was above 98.66%. The MeanPrecision computed using Equation (14) is 0.9999 (99.99%). Therefore, the performance of this codec is almost perfect for this experiment with the synthesised signals.
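As a hedged sketch of the decode-side comparison and the precision measures in (13) and (14), the following might serve; detect_amplitudes is a hypothetical stand-in for the CSPE-based identification step, not code from the paper:

import numpy as np

def decode_bits(frames, u, k, detect_amplitudes):
    # Recover the embedded bit sequence frame by frame.
    # detect_amplitudes(frame, u, k) should return the magnitudes
    # of the two components selected by Eq. (10).
    bits = []
    for frame in frames:
        ampA, ampB = detect_amplitudes(frame, u, k)
        bits.append(1 if ampA > ampB else 0)   # rule used at embedding
    return np.array(bits)

def codec_precision(dcode, ecode):
    # Eq. (13): fraction of the L bits recovered correctly.
    L = len(ecode)
    return (L - np.sum(np.abs(dcode - ecode))) / L

# Eq. (14) is then just the mean over all N test signals:
# mean_precision = np.mean([codec_precision(d, e) for d, e in results])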


Furthermore, the decode experiment in this case represented a single iteration of a bit sequence over the length of a signal. Given that any real-world use of such a scheme would allow a bit sequence to be embedded repeatedly in a cover signal, it would be possible to increase the effectiveness of the decode process by, for example, decoding repeatedly and taking the mode of the results.

6.0 CONCLUSION
We have proposed an application that utilises the super-resolution capabilities of the CSPE algorithm to accurately identify individual components of an audio signal, calculate their magnitudes and then alter those magnitudes as appropriate to represent a particular bit value.

Experimental tests using 500 synthesised signals incorporating multiple randomly generated components, each embedded with a bit sequence of length 150, showed completely perfect decoding for 99.2% of the signals, with an average overall accuracy of 99.99%.

Future work will determine how to calculate and set the magnitude so that signal watermarking is perceptually invisible, by evaluating whether to modify the component to the left or right of the user-defined frequency value, or both. Also, the impact of accidental and deliberate attacks on the watermarked signal will be evaluated.

7.0 REFERENCES
[1] Merriam-Webster Online Dictionary. http://www.merriam-webster.com/dictionary/steganography
[2] S. A. Craver, M. Wu, and B. Liu, "Reading between the lines: Lessons from the SDMI challenge", in 10th USENIX Security Symposium, Washington, DC, 2001.
[3] P. Moulin and R. Koetter, "Data-Hiding Codes", Proc. of the IEEE, Vol. 93, No. 12, Dec. 2005.
[4] K. Gopalan et al., "Covert Speech Communication Via Cover Speech By Tone Insertion", Proc. of the 2003 IEEE Aerospace Conference, Big Sky, MT, March 2003.
[5] R. Healy and J. Timoney, "Digital Audio Watermarking with Semi-Blind Detection for In-Car Music Identification", Audio Engineering Society 36th International Conference, Michigan, USA, June 2-4, 2009 (in press).
[6] K. M. Short and R. A. Garcia, "Signal Analysis using the Complex Spectral Phase Evolution (CSPE) Method", Audio Engineering Society 120th Convention, Paris, France, May 2006.
[7] D. Nelson, "Cross Spectral Methods for Processing Speech", Journal of the Acoustical Society of America, vol. 110, no. 5, pt. 1, Nov. 2001, pp. 2575-2592.
[8] The online Webster Dictionary. http://www.webster-dictionary.net/definition/interpolation


USING CONVOLUTIVE NON-NEGATIVE MATRIX FACTORIZATION ALGORITHM TO PERFORM ACOUSTIC ECHO CANCELLATION

XIN ZHOU, YE LIANG, NIALL CAHILL, ROBERT LAWLOR
Department of Electronic Engineering, National University of Ireland Maynooth, Maynooth, Co. Kildare, Ireland.
Email: zhou.xin@nuim.ie, liang.ye@nuim.ie, ncahill@eeng.nuim.ie, rlawlor@eeng.nuim.ie

In this paper Convolutive Non-negative Matrix Factorization (CNMF) is used to perform acoustic echo cancellation (AEC). This modified version of NMF employs the idea of convolutive basis decomposition: CNMF makes use of a single basis function which spans the pattern length, allowing underlying patterns which cross multiple columns to be revealed as single bases. We realize this idea in a simulation environment and apply it to AEC. The approach is evaluated through experiments on simulated data; the results show that acoustic echo can be reduced using this approach.

1. INTRODUCTION
In recent years the use of hands-free communication has increased significantly. A problem with such hands-free communication is the occurrence of acoustic echo. Acoustic echo exists in any communications scenario where there is a speaker and a microphone, e.g. a hands-free car phone or a hands-free conference phone. It arises when sound from a loudspeaker is picked up by the microphone in the same room: when the far-end user's speech is played through the near-end loudspeaker, the near-end microphone picks it up and retransmits it back to the far-end user. Usually the loudspeaker-enclosure-microphone (LEM) coupling is represented as a time-invariant linear FIR filter h(t); however, since small changes in the enclosure environment, such as opening or closing a door, can greatly affect the LEM filter, it is necessary to use an adaptive LEM filter to model the echo path over time. Denoting the far-end participant's voice signal x(t), the near-end speech signal v(t), and the noise n(t), the echo-corrupted microphone signal y(t) can be stated as:

$$y(t) = n(t) + v(t) + \sum_{i=0}^{N-1} h(i)\,x(t-i) \qquad (1)$$

Here N is the length of the impulse response and t is the time index.

Most existing AEC techniques use Least Mean Square (LMS) or Normalized LMS (NLMS) [1] to estimate and update the LEM filter coefficients. These work by calculating an estimate of the acoustic echo based on the reference speech and the incoming captured speech signal. This acoustic echo estimate is then subtracted from the near-end speech before being sent to the far-end user.

There are some problems with this approach. First, long estimation filters are needed for the LEM filter; the resulting long impulse responses can lead to convergence issues and a large computational load. Secondly, the adaptive algorithm may diverge if the reference signal contains noise. Thirdly, changes in the LEM filter may lead to a period of misadjustment. Finally, the adaptive algorithm may diverge away from suitable FIR coefficients when both near-end and far-end users speak simultaneously [2].

In order to overcome these problems, a number of techniques have been developed [1]. However, the acoustic echo problem has yet to be solved completely. In this paper an alternative approach to AEC is presented: a monaural sound source separation (SSS) technique based on convolutive non-negative matrix factorisation (CNMF) is employed to perform AEC. It is shown that this approach can lead to significant echo reduction, particularly during doubletalk.
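The paper describes the echo path only mathematically; as a minimal sketch of the microphone-signal model in Eq. (1) (the decaying random impulse response in the comment is an illustrative assumption, not from the paper):

import numpy as np

def mic_signal(x, v, h, n=None):
    # Eq. (1): near-end microphone signal = noise + near-end speech
    # + far-end speech convolved with the LEM impulse response h.
    echo = np.convolve(x, h)[: len(x)]      # sum_i h(i) * x(t - i)
    if n is None:
        n = np.zeros_like(x)
    return n + v + echo                     # v must match x in length

# Illustrative LEM filter: an exponentially decaying random response
# h = 0.1 * np.exp(-np.arange(256) / 40.0) * np.random.randn(256)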
The structure of the paper is as follows. In Section 2, monaural sound source separation is described; Sections 3 and 4 introduce the theory of NMF and CNMF; Section 5 describes the experiments using CNMF to perform AEC, followed by results and discussion in Section 6.

2. MONAURAL SOUND SEPARATION
Monaural sound source separation is a technique that attempts to separate a number of sound sources given only one mixture of the sources. As only one mixture


exists, no spatial information about the sources is available. This means that blind source separation techniques such as Independent Component Analysis (ICA) or spatial filtering based on sensor arrays cannot be used, as they require multiple sensors. Similarly, underdetermined blind source separation techniques based on sparsity and spatial cues which require at least two mixtures, such as DUET and ADRess, are also not applicable [15].

In many situations, such as telecommunications and audio retrieval, only a monaural (one microphone) solution is available. The use of prior information about the source signals makes monaural sound source separation possible. The idea of this technique is to train bases or models using training data for each speaker beforehand and then match these models with a mixture containing these speakers. Within this framework, many different approaches have been developed, including non-negative matrix factorization (NMF) [3], sparse NMF (SNMF) [4], Markov models [5] and local NMF [6] bases; in [7], time-domain bases were trained using ICA and then matched to the mixture using a maximum likelihood technique.

In this paper we present the application of such an approach to the AEC problem. We make use of convolutive NMF to perform both training and matching on the audio spectrogram.

3. THE ORIGINAL NON-NEGATIVE MATRIX FACTORIZATION
Non-negative Matrix Factorization (NMF) is a mathematical technique for linear non-negative data [8]. The non-negative constraint leads to a parts-based decomposed representation because it allows only additive, not subtractive, combinations of the original data. The decomposition gives a more intuitive representation of the underlying data [8]. The basic idea is to approximate a data set $V \in \mathbb{R}_{\ge 0}^{M \times N}$ as the product of two matrices $W \in \mathbb{R}_{\ge 0}^{M \times R}$ and $H \in \mathbb{R}_{\ge 0}^{R \times N}$:

$$V \approx W \cdot H \qquad (2)$$

The non-negative constraint forces the factors W and H to be non-negative, i.e., all elements must be equal to or greater than zero. By changing R, the degree of the approximation can be varied: a large R decreases the reconstruction error, while a small R increases it.

The next step is to estimate W and H, which is an optimization problem. Lee and Seung [9] defined two approaches for estimating W and H, based on different cost functions. A generalized version of the Kullback-Leibler divergence is the cost function used in [2]:

$$D(V \,\|\, W,H) = \left\| V \odot \log\frac{V}{WH} - V + WH \right\|_{Fro} \qquad (3)$$

where ⊙ is the Hadamard (element-wise) product. The purpose of this optimization is to minimize this cost function with respect to W and H, subject to non-negativity. The following multiplicative update rules for H and W are derived from equation (3) [9]:

$$H \leftarrow H \odot \frac{W^{T}\frac{V}{WH}}{W^{T}\mathbf{1}}, \qquad W \leftarrow W \odot \frac{\frac{V}{WH}\,H^{T}}{\mathbf{1}\,H^{T}} \qquad (4)$$

The iterative procedure is halted when the cost function D reaches a pre-defined threshold.

The matrices H and W express different aspects of the factorization: the columns of W contain the bases for the data and the rows of H contain the contribution of each basis to the data over time. When W and H are multiplied together, the data is reconstructed with an error whose magnitude depends on R and the data.
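For reference before the convolutive extension, a minimal NumPy sketch of the multiplicative updates in (4) could read as follows (the small constant eps guards against division by zero and is an implementation detail, not part of the formulation):

import numpy as np

def nmf_kl(V, R, iters=200, eps=1e-9):
    # KL-divergence NMF via the multiplicative updates of Eq. (4).
    M, N = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((M, R)) + eps
    H = rng.random((R, N)) + eps
    ones = np.ones_like(V)
    for _ in range(iters):
        H *= (W.T @ (V / (W @ H + eps))) / (W.T @ ones + eps)
        W *= ((V / (W @ H + eps)) @ H.T) / (ones @ H.T + eps)
    return W, H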
4. CONVOLUTIVE NMF
The following theory of CNMF draws on the work in [10]. The original NMF represents regularly repeating patterns which span multiple columns of the V matrix using a number of different bases to describe the entire sequence. CNMF instead uses a single basis function that spans the pattern length; this kind of situation is frequently encountered when analyzing audio signals. This version of NMF allows us to reveal the underlying patterns which cross multiple columns as single bases. A series of experiments shows that this approach leads to a significant reduction in the level of acoustic echo.

As described in Section 3, NMF uses a matrix product V ≈ W·H to reconstruct the estimated data matrix; convolutive NMF extends this expression to:

$$V \approx \sum_{t=0}^{T-1} W(t) \cdot \overset{t\rightarrow}{H} \qquad (5)$$

where $V \in \mathbb{R}_{\ge 0}^{M \times N}$ is the input data matrix, $W(t) \in \mathbb{R}_{\ge 0}^{M \times R}$ are the training bases, and $H \in \mathbb{R}_{\ge 0}^{R \times N}$ contains the weights. The $(\cdot)^{i\rightarrow}$ operator is a shift operator that moves the columns of its argument by i elements to the right, and conversely $(\cdot)^{\leftarrow i}$ shifts to the left. The columns that are shifted in from outside the matrix are set to zero.
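A minimal sketch of the column-shift operator and the reconstruction in (5), assuming the T basis matrices are held in a Python list:

import numpy as np

def shift_right(H, t):
    # Shift the columns of H by t positions to the right; columns
    # entering from outside the matrix are set to zero.
    if t == 0:
        return H.copy()
    out = np.zeros_like(H)
    out[:, t:] = H[:, :-t]
    return out

def cnmf_reconstruct(W_list, H):
    # Eq. (5): V_hat = sum over t of W(t) @ (H shifted right by t).
    return sum(W @ shift_right(H, t) for t, W in enumerate(W_list))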


Equation (5) is a summation of convolution operations between corresponding elements from a set of two-dimensional bases W and a set of weights H. The set of i-th columns of W(t) defines a two-dimensional structure. This matrix is shifted and scaled by convolution across the axis of t with the i-th row of H. The resulting reconstruction is a summation of all the basis convolution results for each of the R bases [9].

The estimation of the appropriate set of matrices W(t) and H to approximate V is based on the NMF framework of Lee and Seung [9]. In accordance with the NMF cost function, the convolutive NMF cost function is defined as:

$$D = \left\| V \odot \ln\frac{V}{\hat V} - V + \hat V \right\|_{F} \qquad (6)$$

where V̂ is the approximation of V, defined as:

$$\hat V = \sum_{t=0}^{T-1} W(t) \cdot \overset{t\rightarrow}{H} \qquad (7)$$

By linearity, this cost function decomposes into a series of simultaneous NMF approximations, one for each value of t. The cost function is then optimized by optimizing this set of T NMF approximations: for each NMF approximation, the equivalent W(t) and the appropriately shifted H are updated. This gives the convolutive NMF update equations:

$$H \leftarrow H \odot \frac{W(t)^{T}\cdot\overset{\leftarrow t}{\left[\frac{V}{\hat V}\right]}}{W(t)^{T}\cdot\mathbf{1}}, \qquad W(t) \leftarrow W(t) \odot \frac{\frac{V}{\hat V}\cdot\overset{t\rightarrow}{H}{}^{T}}{\mathbf{1}\cdot\overset{t\rightarrow}{H}{}^{T}} \qquad (8)$$

H and W(t) are updated in every iteration and for each t. For each t, W(t) is updated by the corresponding NMF, but H is shared and shifted across all t's within an iteration. Updating both W(t) and H for each t may result in a biased estimate of H, with the update for t = T - 1 dominating the others. It is therefore best to update all W(t) first and then assign to H the average over all the NMF subproblems:

$$H \leftarrow \frac{1}{T}\sum_{t=0}^{T-1} H \odot \frac{W(t)^{T}\cdot\overset{\leftarrow t}{\left[\frac{V}{\hat V}\right]}}{W(t)^{T}\cdot\mathbf{1}} \qquad (9)$$

In terms of computational complexity this technique depends mostly on T. If T = 1 it reduces to standard NMF; otherwise it carries extra matrix updates equivalent to one NMF per unit of T. In this paper we implement this idea in the Matlab simulation environment and apply it to the specific application of acoustic echo cancellation.

To perform monaural sound source separation with CNMF, two stages are required. First, a sequence of spoken speech is acquired from each speaker, a spectrogram is calculated for each sequence, and NMF decomposition is performed on each spectrogram separately. Separate low-rank W basis matrices are thus trained for each individual speaker. The resultant W matrices (one per speaker) are then concatenated into a large W matrix called W_train. The second stage is the separation (or matching) stage, where a mixture of speech containing known speakers is separated into individual sources. This is achieved by performing CNMF decomposition on the speech mixture using W_train from the training stage; in this stage W_train is held fixed and the H matrix is updated in each iteration. This process means that each individual speaker's basis matrix captures the mixture spectral energy in proportion to the contribution that speaker made to the mixture. After a prescribed number of iterations has been reached, W_train is separated back into the individual W matrices of the speakers. The resultant V matrix for each source is obtained by multiplying its portion of W_train with the corresponding portion of the H matrix from the separation stage.
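A rough sketch of the two-stage procedure just described, reusing the hypothetical nmf_kl helper above; for brevity the matching stage uses the T = 1 (plain NMF) special case, updating only H while W_train stays fixed:

import numpy as np

def train_bases(spectrograms, R):
    # Stage 1: train a low-rank W per speaker and concatenate them.
    Ws = [nmf_kl(S, R)[0] for S in spectrograms]
    sizes = [W.shape[1] for W in Ws]
    return np.hstack(Ws), sizes

def match_mixture(V_mix, W_train, iters=200, eps=1e-9):
    # Stage 2: with W_train fixed, iterate only the H update.
    R, N = W_train.shape[1], V_mix.shape[1]
    H = np.random.default_rng(0).random((R, N)) + eps
    ones = np.ones_like(V_mix)
    for _ in range(iters):
        H *= (W_train.T @ (V_mix / (W_train @ H + eps))) / \
             (W_train.T @ ones + eps)
    return H

# Each speaker's spectrogram estimate is W_i @ H_i, taking that
# speaker's columns of W_train and the matching rows of H.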
5. EXPERIMENTS
The experiments in this paper demonstrate that the CNMF-based approach to AEC can match and remove the echo with or without the presence of far-end speech in the near-end microphone signal. We use the condition of conventional AEC, namely that the far-end speaker's speech is used to excite the LEM system at the near-end user. For this experiment we neglect the effect of noise on the overall system (neither measurement noise nor background noise is included), and we divide the processing into different groups.

The experiments were carried out in Matlab, using clean speech from different speakers chosen randomly from the TIMIT database [13]. We used some speech as near-end and some as far-end, then applied a room response to the far-end speech to create the echo. We then mixed these signals to create the mixture. Each mixture contains both a near-end speaker contribution and a far-end speaker contribution (here treated as noise).

We trained a W basis matrix for the near-end speaker using training data for each experiment. We also trained a separate specific W basis for use on the incoming far-end speech. The number of basis vectors R within W was set to 32 for both the near-end and far-end speech bases. These bases were used to separate the mixture spectrogram and remove the echo from the returning signal. The time base for convolution was set to 8, which means the algorithm reads in 8 frames of data and performs a shift of H and an update of W(t) 8 times in each


iteration, one NMF for each frame. This is because each frame of the sound signal occupies only one column; as a result, processing the sound signal with CNMF requires no extra computational load. From the experiments we found that estimating the contribution from the original mixture produced better results. The results of this test are listed as Output 1 in Table 1. We also performed an additional experiment to evaluate this approach when no knowledge of the local speaker is available: speech from other speakers was used to train generic, speaker-independent W bases, and these bases were used to perform the matching. The results of this experiment are listed as Output 2 in Table 1. A third experiment was performed to compare and illustrate the benefit of having the reference signal in AEC, displayed as Output 3 of Table 1. For this experiment, speaker-specific bases were trained for the near-end and far-end speakers without the specific reference speech being provided.

We used both objective and subjective measurements to analyze the results of the experiments. In the subjective listening tests, a panel of subjects listened to the input and output speech to assess the effect of the algorithm. The objective analysis used three energy ratios based on the input and output speech to analyze the performance of the CNMF algorithm during doubletalk. Two of the three ratios were taken from a standardized set of energy ratios defined in [14]. The first is the signal-to-interference ratio (SIR), which measures the amount of echo still left in the returning near-end speech:

$$SIR = 10\log_{10}\frac{\|s_{target}\|^2}{\|e_{interf}\|^2} \qquad (10)$$

The second is the signal-to-distortion ratio (SDR), which measures the amount of distortion introduced into the original signal by the algorithm:

$$SDR = 10\log_{10}\frac{\|s_{target}\|^2}{\|e_{interf} + e_{artef}\|^2} \qquad (11)$$

where e_interf is the interference energy left in the output, e_artef is the energy of processing artefacts left after processing, and s_target is the near-end speech. The third energy ratio measures the level of echo suppression, the echo return loss enhancement (ERLE), defined as follows:

$$ERLE = 10\log_{10}\frac{E\{y^2(t)\}}{E\{e^2(t)\}} \qquad (12)$$

where y(t) is the echo signal and e(t) is the echo after processing. The first two measures were used on mixtures that contained both far-end and near-end speech together; the third was used only to compare the level of echo reduction during pauses in speech. The results of the SIR and SDR ratios are shown in Table 1, and the ERLE results are listed in Table 2.

Figure 1 a) Near-end speaker with pause, b) Far-end signal (noise), c) Mixture at near-end microphone, d) Output from algorithm.
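A minimal sketch of the three energy ratios in (10)-(12); it assumes the target/interference/artefact decomposition of the output has already been obtained, e.g. with the evaluation framework of [14]:

import numpy as np

def sir_db(s_target, e_interf):
    # Eq. (10): signal-to-interference ratio in dB.
    return 10 * np.log10(np.sum(s_target**2) / np.sum(e_interf**2))

def sdr_db(s_target, e_interf, e_artef):
    # Eq. (11): signal-to-distortion ratio in dB.
    return 10 * np.log10(np.sum(s_target**2)
                         / np.sum((e_interf + e_artef)**2))

def erle_db(y, e):
    # Eq. (12): echo return loss enhancement, with the expectations
    # approximated by sample means over the pause segment.
    return 10 * np.log10(np.mean(y**2) / np.mean(e**2))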


6. RESULTS AND DISCUSSION
The results of the experiments show that this approach leads to a significant reduction in the level of echo. The SIR results in Table 1 show that the algorithm can achieve a significant reduction in the echo signal during doubletalk with speaker-specific trained bases. The results from using the independently trained basis matrices (Output 2, Table 1) are similar to those of the experiments using speaker-specific trained bases, which suggests that this approach is not speaker dependent. Output 3 in Table 1 shows the results when the actual reference speech is not known. This result is not as good as in the previous experiments, suggesting that having the actual far-end reference signal is necessary for improved overall performance.

The results listed in Table 2 show the performance of the AEC approach during pauses in the near-end speech. The results are quite good overall; however, not all echo was removed, because some of the local speaker bases matched some of the echo energy, thus retaining it in the output.

In the subjective listening tests, three volunteers listened to the input speech and the output speech after a run of the algorithm. All of the listeners agreed that the echo was reduced significantly in the output speech, which is important for the practical application of the algorithm.

Further work on this topic includes combining features of different versions of NMF, or other mathematical tools, to find the optimal or best-suited algorithm for different applications such as AEC and musical separation. Moreover, the number of basis functions R for the echo and near-end speech bases could be optimized. Alternatively, some of the non-linear post-processing techniques used to improve LMS methods, such as component zeroing, could be employed to improve performance [1].

Near-end | Far-end (echo) | Input SDR dB | Input SIR dB | Output 1 SDR dB | Output 1 SIR dB | Output 2 SDR dB | Output 2 SIR dB | Output 3 SDR dB | Output 3 SIR dB
Speech 1 | Speech 2 | 3.0204 | 3.0204 | 9.3880 | 24.4273 | 8.8893 | 22.7629 | 3.3207 | 9.8459
Speech 2 | Speech 3 | 3.2072 | 3.2072 | 6.8427 | 23.5563 | 6.1803 | 21.3245 | 4.7592 | 8.7264
Speech 3 | Speech 4 | 2.5625 | 2.5625 | 8.6977 | 22.5722 | 7.4898 | 18.3213 | 3.0983 | 8.1832
Speech 4 | Speech 1 | 2.3412 | 2.3412 | 5.6212 | 17.5956 | 5.2179 | 17.2864 | 4.2391 | 6.7645
Average | | 2.7828 | 2.7828 | 7.6374 | 22.2879 | 6.9443 | 19.9237 | 3.8543 | 8.3801

Table 1: Energy ratio results from the experiments. Input SDR and SIR are the input ratios. Output 1 SDR/SIR are the results from speaker-dependent bases and Output 2 from independent bases. Output 3 gives the results obtained without the reference signal.

Near-end | Far-end (echo) | ERLE (dB)
Speech 1 | Speech 2 | 10.2991 (see Figure 1d)
Speech 2 | Speech 3 | 11.2945
Speech 3 | Speech 4 | 10.1103
Speech 4 | Speech 1 | 9.2382
Average | | 10.2355

Table 2: Calculated ERLE for pauses in the near-end speech experiments.


7. CONCLUSIONS
In this paper we employed a new method called Convolutive Non-negative Matrix Factorization and showed how it can be used to perform acoustic echo cancellation. From the experimental results we can see that acoustic echo can be reduced using CNMF within a monaural sound source separation framework. CNMF can achieve good cancellation results; however, combining its features with some other techniques may lead to even better performance. Further work includes developing algorithms for faster convergence and improved performance in terms of minimizing the objective function.

REFERENCES
[1] S. Haykin and B. Widrow, Least-Mean-Square Adaptive Filters, Wiley-Interscience, Hoboken, NJ, 2003.
[2] N. Cahill and R. Lawlor, "A Novel Approach to Acoustic Echo Cancellation", Department of Electronic Engineering, National University of Ireland Maynooth, Eurasip08.
[3] P. Smaragdis, "Discovering auditory objects through non-negativity constraints", in SAPA, 2004.
[4] M. N. Schmidt and R. K. Olsson, "Single-channel speech separation using sparse non-negative matrix factorization", in INTERSPEECH, 2006.
[5] S. T. Roweis, "One microphone source separation", in NIPS, 2001, pp. 793-799.
[6] T. Feng, S. Z. Li, H.-Y. Shum, and H. Zhang, "Local Non-Negative Matrix Factorization as a Visual Representation", Microsoft Research Asia, Beijing Sigma Center, Beijing 100080, China.
[7] G. J. Jang and T. W. Lee, "A maximum likelihood approach to single channel source separation", JMLR, vol. 4, pp. 1365-1392, 2003.
[8] D. D. Lee and H. S. Seung, "Learning the Parts of Objects by Non-negative Matrix Factorization", Nature, 1999, (401):788.
[9] D. D. Lee and H. S. Seung, "Algorithms for non-negative matrix factorization", in Advances in Neural Information Processing Systems 13, 2000.
[10] P. Smaragdis, "Convolutive Speech Bases and their Application to Supervised Speech Separation", IEEE Trans. on Audio, Speech and Language Processing, Vol. 15, Issue 1, pp. 1-12, January 2007.
[11] S. T. Roweis, "One Microphone Source Separation", in Neural Information Processing Systems 13, 2000.
[12] J. B. Allen and D. A. Berkley, "Image method for efficiently simulating small-room acoustics", J. Acoust. Soc. Amer., vol. 65, no. 4, pp. 943-950, Apr. 1979.
[13] W. M. Fisher, G. R. Doddington, and K. Goudie-Marshall, "The DARPA Speech Recognition Research Database: Specifications and Status", Proceedings of DARPA Workshop on Speech Recognition, pp. 93-99, Feb. 1986.
[14] E. Vincent, R. Gribonval, and C. Fevotte, "Performance Measurement in Blind Audio Source Separation", IEEE Trans. on Speech and Audio Processing, Volume PP, Issue 99, 2005, pp. 1-8.
[15] T. Virtanen, "Monaural Sound Source Separation by Nonnegative Matrix Factorization with Temporal Continuity and Sparseness Criteria", IEEE Transactions on Audio, Speech and Language Processing, Vol. 15, No. 3, March 2007.


Using Apodization to improve the performance of the Complex Spectral Phase Estimation (CSPE) Algorithm

Jian Wang #1, Joseph Timoney #2, Matthieu Hodgkinson #3
# Computer Science Department, National University of Ireland, Maynooth, Co. Kildare, Ireland
1 jwang@cs.nuim.ie  2 jtimoney@cs.nuim.ie  3 matthew.hodgkinson@nuim.ie

Abstract— The recently introduced Complex Spectral Phase Evolution (CSPE) algorithm is a super-resolution technique for the estimation of the exact frequency values of sinusoidal components in a signal. However, if a component of the signal does not exist within the entire data set, it cannot be identified by the CSPE algorithm, even though it may still be visible in the FFT magnitude spectrum. In this paper, we identify the source of this problem and propose a novel approach to resolve the issue. Specifically, we show how to use a window apodization function to improve the CSPE algorithm. Experimental results are presented to illustrate the performance enhancement.

Keywords— CSPE, Apodization, Kaiser Window

I. INTRODUCTION
Most often, the estimation of the frequencies of a signal composed of sinusoidal components is done in the frequency domain using peak-picking from the magnitude spectrum of the signal. However, the accuracy of this approach is severely limited in cases where a component frequency is not a multiple of the sampling frequency divided by the windowed signal length. In essence, this means that only when a component frequency is aligned exactly with the analysis frequencies of the DFT can it be measured accurately. When the component frequency does not satisfy this constraint, a common solution, used in sinusoidal modelling algorithms, is to apply quadratic interpolation to the component spectral magnitudes immediately either side of the true frequency to find the correct frequency and magnitude values. However, the performance of this method is highly dependent on the window function used [1] and the length of the data available for analysis. The CSPE algorithm was introduced in [2] as a method to accurately estimate the frequency of components that exist within a short time frame. It was also designed to be computationally efficient, and it is related in some aspects to the cross-spectrogram technique of [3].

However, the CSPE algorithm has been found to be unable to detect frequency components that do not appear throughout the entire signal source under analysis. This is puzzling because an associated peak can still appear for the component in the FFT magnitude spectrum. To resolve this issue it is necessary to investigate the CSPE algorithm in more detail and determine how it can be improved.

This paper is organized as follows. Firstly, we give a general introduction to the CSPE algorithm, followed by an experimental evaluation that compares the CSPE algorithm with the widely used frequency estimation method introduced in [4]. Then, we explain in more detail the problem of identifying components that do not exist for the complete data frame and introduce the idea of apodization to solve it. Lastly, we show the improvement to the CSPE result obtained by using the apodization function, supported by experimental results.

II. CSPE AND ITS COMPARISON WITH ANOTHER FREQUENCY ESTIMATION APPROACH
The principle of the CSPE algorithm can be described as follows: an FFT analysis is performed twice, firstly on the signal of interest, and the second time upon the same signal shifted in time by one sample.
Then, by multiplying the sample-shifted FFT spectrum with the complex conjugate of the initial FFT spectrum, a frequency-dependent function is formed from which the exact values of the frequency components it contains can be detected. This frequency-dependent function has a staircase-like appearance, where the flat parts of the graph indicate the exact frequencies of the components. The width of the flat parts depends on the main-lobe width of the window function used to select the signal before FFT processing. Mathematically, the algorithm can be described as follows.

Assume a real signal s_0 and a one-sample-shifted version of this signal s_1. Say that its frequency is β = q + δ, where q is an integer and δ is a fractional number. If b is an initial phase, w is the window function used in the FFT, F_w s_0 is the windowed Fourier transform of s_0, and F_w s_1 is the windowed Fourier transform of s_1, then, first writing


$$D = e^{j2\pi\beta/N} \qquad (1)$$

the frequency-dependent CSPE function can be written as

$$\mathrm{CSPE} = F_w s_0 \odot F_w^{*} s_1 = \left(\frac{a}{2}\right)^{2}\left[\,D^{*}\left|F_w(D^{n})\right|^{2} + D\left|F_w(D^{-n})\right|^{2} + 2\,\mathrm{Re}\left\{e^{j2b}\,D^{*}\,F_w(D^{n})\otimes F_w(D^{-n})\right\}\right] \qquad (2)$$

The windowed transform requires multiplication of the time-domain data by the analysis window, and thus the resulting transform is the convolution of the transform of the window function, w_f, with the transform of a complex sinusoid. Since the transform of a complex sinusoid is nothing but a pair of delta functions at the positive and negative frequency positions, the result of the convolution is merely a frequency-translated copy of w_f centred at +β and -β. Consequently, with a standard windowing function, the F_w(D^n) term is only considerable when k ≈ β, and it decays rapidly when k is far from β. Therefore, the analysis window must be chosen carefully so that it decays rapidly, to minimize any spectral leakage into adjacent bins. If this is so, it renders the interference terms, i.e. the second and third terms in (2), negligible. Thus, the CSPE for the positive frequencies gives:

$$\mathrm{CSPE}_w \approx \frac{a^{2}}{4}\left|F_w(D^{n})\right|^{2} D^{-1} \qquad (3)$$

Finding the angle of (3) leads to the CSPE frequency estimate:

$$f_{CSPE_w} = \frac{-N\,\angle(\mathrm{CSPE}_w)}{2\pi} = \frac{-N}{2\pi}\,\angle\!\left(\frac{a^{2}}{4}\left|F_w(D^{n})\right|^{2} e^{-j\frac{2\pi}{N}\beta}\right) = \frac{-N}{2\pi}\left(-\frac{2\pi}{N}\beta\right) = \beta \qquad (4)$$

The procedure of the CSPE algorithm is depicted in block-diagram form in Figure 1.

Fig. 1 The flow diagram of CSPE: the signal and its one-sample-shifted copy are windowed and transformed by FFT, multiplied bin by bin (one branch conjugated), and the angle of the product gives the frequency estimate.

An example of the output of the CSPE algorithm is shown in Figure 2. Consider the signal S1, which contains components with frequency values (in Hz) of 17, 293.5, 313.9, 204.6, 153.7, 378 and 423. The sampling frequency is 1024 Hz. A frame of 1024 samples in length is windowed using a Blackman window and padded with 1024 zeros. The frequency-dependent CSPE function is computed as per Eq. (2). As shown in Figure 2, each component can be identified exactly; each is labelled with an arrow in the graph. The largest error among all the component frequency estimates is approximately 0.15 Hz. Notice too that, at the flat sections in the graph of the CSPE result where the arrows point, the widths of the flat sections are related to the width of the window's main lobe in the frequency domain.

Fig. 2 Frequency estimation of S1 by CSPE (frequency value vs. bin index)
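The paper presents the algorithm mathematically; as a hedged NumPy sketch of the Figure 1 procedure (window, two FFTs one sample apart, bin-by-bin conjugate product, angle mapping), where the padding factor and window choice follow the S1 example:

import numpy as np

def cspe_frequencies(s, fs, pad_factor=2):
    # Per-bin CSPE frequency estimates (Hz) for a real signal s;
    # flat runs in the returned curve mark the component frequencies.
    N = len(s) - 1
    w = np.blackman(N)
    nfft = pad_factor * N                  # zero padding, as in the text
    F0 = np.fft.fft(w * s[:N], nfft)       # signal of interest
    F1 = np.fft.fft(w * s[1:N + 1], nfft)  # one-sample-shifted copy
    cspe = F0 * np.conj(F1)                # bin-by-bin conjugate product
    # A one-sample shift multiplies each component by exp(j*2*pi*f/fs),
    # so the angle of the product maps back to frequency in Hz (Eq. 4).
    return -np.angle(cspe) * fs / (2 * np.pi)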


2.1 Accuracy of the CSPE algorithm
An experiment was carried out to compare the accuracy of the Quadratic Interpolation Estimation Algorithm [4] with that of the CSPE algorithm. The procedure was as follows: twenty centre frequencies fc_i (0 < fc_i < Fs/2) were defined and, for each fc_i, M random frequencies were generated (each with a small random fluctuation around fc_i); thereafter, M signals were created based on these M frequencies. The RMS errors of the frequency estimates produced by the CSPE and by the Quadratic Interpolation Estimation Algorithm were then calculated over these M signals for each fc_i, as shown in Figure 3.

Fig. 3 Accuracy comparison of quadratic fit and CSPE frequency refinements (RMS error vs. centre frequency)

As shown in Figure 3, the CSPE estimate was found to be more accurate than the quadratic interpolation approach, by over a factor of 10^3 in many cases.

From the above experiment, it is clear that the CSPE algorithm works very well when the components contained in the signal are constant and stable over the entire data length. However, there can be cases where some components appear in only half, or even a quarter, of the data frame. We can run another experiment on a signal S2 that has the same frequency components as S1, but with each component restricted to appear in only half or a quarter of the frame. The resulting output of the CSPE algorithm is shown in Figure 4.

Fig. 4 Frequency estimation of S2 by CSPE (frequency value vs. bin index)

From Figure 4, it can be seen that there is no flat region in any part of the graph; that is, none of the frequency components can be identified by the CSPE algorithm. However, if the FFT magnitude spectrum of S2 is plotted, as shown in Figure 5, each frequency component is still visible, which indicates that there should be some information related to the components present in any FFT-based frequency-domain analysis. The next section therefore analyses this problem and proposes a novel approach to deal with it.

Fig. 5 FFT spectrum of the signal (magnitude vs. bin index)

III. ANALYSIS OF THE PROBLEM AND AN IMPROVEMENT ON THE CSPE ALGORITHM
Let us suppose there are three signals x1[n], x2[n] and x3[n], all of length 1024 samples, with the same sampling frequency of 1024 Hz, and all bearing the same frequency component at 123.5 Hz. The difference among the three signals is that the component appears over the entire


length of x1[n], while it appears in only a half and a quarter of the length of signals x2[n] and x3[n] respectively, the remaining sample values being zero. If we do a normal FFT analysis, this component will not be centred on a frequency bin and instead will produce a representation with significant peaks at the 124th and 125th bins, with smaller components dying away either side of them. Thus, this signal is an ideal candidate for a CSPE algorithm analysis.

It is possible to rewrite x2[n] and x3[n] in terms of the product of x1[n] and a step function. If u1[n] and u2[n] are two different unit step functions, then:

x2[n] = x1[n]·u1[n]  (5)
x3[n] = x1[n]·u2[n]  (6)

Denoting by F(x1) the FFT of x1[n] and by F(w) that of a suitable window function w[n], such as a Blackman window, the spectral representation of the signal x1[n] can be written as

F(x1_w) = F(x1) * F(w)  (7)

where * denotes the convolution operator. Then, the spectral descriptions of the other signals can be written as

F(x2_w) = F(x2) * F(w) = F(x1) * F(u1) * F(w)  (8)

and likewise

F(x3_w) = F(x3) * F(w) = F(x1) * F(u2) * F(w)  (9)

Examining equations (8) and (9), it is possible to interpret the terms F(u1) * F(w) and F(u2) * F(w) as the actual windowing operations applied to the signal x1[n] in the frequency domain. If we now compare the frequency responses of the original and alternative window functions that are effectively applied to x1[n], we can see an important difference between the original window and the others in terms of the main-lobe size and the height of the side-lobes. These are shown in Figure 6.

Fig. 6 Magnitude response of three different actual window functions

From Figure 6, when the signal does not appear over the entire frame, its actual window-function spectrum is significantly different from that of the original window function: the side-lobes are not suppressed to any large extent and the width of the main-lobe is increased. This impacts the CSPE algorithm in that the interference terms outlined in Eq. (2) are not sufficiently suppressed. Thus, because these terms are larger, the CSPE output is useless for finding the exact signal frequencies of x2[n] and x3[n]. Motivated by the reason for introducing a window function into Fourier analysis in the first place, we can introduce a second window function to suppress the greater side-lobes caused by the convolution of the spectrum with the unit step functions. This practice is known as apodization [6]. It is more common in image processing than in 1-D signal processing. Normally, an apodization function is used to suppress the effects of side-lobes at the expense of lowering the spectral resolution. Some researchers, particularly in image processing [7], [8], have shown that the Kaiser window is better for apodization than other window functions such as the Poisson, Gaussian or Tukey windows. The apodization factor using the Kaiser function can be written as

w_ks[n] = 1 - kaiser(N, β)  (10)

The side-lobe suppression of the Kaiser window depends on the parameter β. The apodization of the signal analysis window is then given by

w_A[n] = w[n]·(w_ks[n])^α  (11)

The relationship between different values of β, and the effect of raising w_ks[n] to an integer power α to enhance the suppression, was evaluated experimentally. An example of the side-lobe suppression achieved is depicted in Figure 7.
From Figure 7, we can see that when w_ks[n] is raised to a cubic power (α = 3) with β = 0.01, it has a side-lobe attenuation level greater than 300 dB.

Fig. 7 Magnitude response of the Kaiser window function for several configurations of β and α
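A minimal sketch of constructing the apodized analysis window of Eqs. (10) and (11); the Blackman base window matches the earlier examples, and the CSPE analysis itself is unchanged apart from the window swap:

import numpy as np

def apodized_window(N, beta=0.01, alpha=3):
    # Eqs. (10)-(11): base analysis window multiplied by the inverted
    # Kaiser apodization factor raised to the power alpha.
    w = np.blackman(N)                 # original analysis window w[n]
    w_ks = 1.0 - np.kaiser(N, beta)    # Eq. (10)
    return w * w_ks**alpha             # Eq. (11)

# Usage: replace the plain window in the CSPE analysis, e.g.
# wA = apodized_window(len(s) - 1)
# F0 = np.fft.fft(wA * s[:-1], nfft); F1 = np.fft.fft(wA * s[1:], nfft)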


Next, the signal S2 can be analysed with the function w_A[n] (β = 0.01, α = 3); the CSPE frequency detection result is shown in Figure 8, where the arrows label the detected frequency components. It can be seen that the frequency components are now identified in the CSPE function.

Fig. 8 Frequency estimation by improved CSPE (frequency value vs. bin index)

It was then determined experimentally that, when components exist over a different proportion of a frame, the value of β and the power α of w_ks[n] have to be adapted to get a satisfactory result. Table I summarizes this configuration as a reference for users, where Y means a component can be detected, N means that it cannot, and α is the power to which the Kaiser window is raised.

TABLE I
CONFIGURATION OF β AND α FOR THE DETECTION OF COMPONENTS PRESENT IN DIFFERENT PROPORTIONS OF A FRAME

Proportion of one frame | 1/2 | 1/4 | 1/8 | 1/16
β = 0.01, α = 1 | Y | N | N | N
β = 0.01, α = 3 | Y | Y | N | N
β = 0.01, α = 10 | Y | Y | Y | N
β = 0.01, α = 18 | Y | Y | Y | Y

IV. CONCLUSIONS AND FUTURE WORK
This paper has addressed a problem discovered with the CSPE algorithm: when a frequency component does not exist throughout the entire length of the data frame, then, although it appears in the FFT magnitude spectrum, the CSPE algorithm is not capable of detecting it. By focusing on changing the analysis window's frequency response, the idea of an apodization function was introduced and shown to overcome this difficulty. Experimental results have demonstrated that the performance of the CSPE algorithm is improved by applying this solution. In future, the intention is to extend the CSPE algorithm to correctly identify the dynamic frequency evolution of a frequency-modulated signal.

REFERENCES
[1] F. Keiler and S. Marchand, "Survey on extraction of sinusoids in stationary sounds", Proc. of the 5th Int. Conference on Digital Audio Effects (DAFX-02), Hamburg, Germany, September 2002.
[2] K. Short et al., "Method and apparatus for compressed chaotic music synthesis", United States Patent 6,137,045, October 2000.
[3] D. Nelson, "Cross Spectral Methods for Processing Speech", Journal of the Acoustical Society of America, vol. 110, no. 5, pt. 1, Nov. 2001, pp. 2575-2592.
[4] J. Rauhala, H.-M. Lehtonen and V. Välimäki, "Fast automatic inharmonicity estimation algorithm", Journal of the Acoustical Society of America, vol. 121, no. 5, pp. EL184-EL189, 2007.
[5] B. Frei, "Digital Sound Generation - Part 1: Oscillators", online book, Institute for Computer Music and Sound Technology, Zurich University of the Arts, Switzerland, 2007.
[6] H. C. Stankwitz, R. J. Dallaire, and J. R. Fienup, "Non-linear Apodization for Sidelobe Control in SAR Imagery", IEEE Trans. on Aerospace and Elect. Syst., Vol. 31, No. 1, pp. 267-279, Jan. 1995.
[7] G. Thomas, B. C. Flores, and J. Sok-Son, "SAR sidelobe apodization using the Kaiser window", Image Processing 2000 Proceedings.


Computing Modified Bessel functions with large Modulation Index for Sound Synthesis Applications

Joseph Timoney, Thomas Lysaght
Dept. of Computer Science, NUI Maynooth, Co. Kildare, Ireland.
jtimoney@cs.nuim.ie

Victor Lazzarini
Dept. of Music, NUI Maynooth, Co. Kildare, Ireland.
victor.lazzarini@nuim.ie

Ruiyao Gao
Dept. of Electronic Eng., ITT Tallaght, Dublin, Ireland.
rgao@ittdublin.ie

Abstract- Ordinary Bessel functions are commonly used when examining the spectral properties of frequency-modulated signals, particularly in sound synthesis applications. Recently, it was shown that modified Bessel functions can also be used for sound synthesis. However, to limit the impact of aliasing distortion when using these functions, it is essential to set an upper limit on the frequency-dependent modulation index used when computing them. It can be impossible to do this beyond a certain threshold when using standard mathematical software tools such as Matlab, or the scientific toolbox of the Python language, because of numerical overflow issues. This short paper presents an approach to overcome this limitation using the MaxStar algorithm. Results are also presented to demonstrate the usefulness of this solution.

Keywords: Modified Bessel functions, numerical overflow, MaxStar algorithm.

I. INTRODUCTION
Frequency Modulated (FM) signals are important in both telecommunications and sound synthesis. Ordinary Bessel functions are a key mathematical tool for understanding the spectral properties of these FM signals [1]. The success of FM synthesis as a sound generating technique led to the exploration of other techniques similar in concept, specifically using Modified Bessel functions [2]. However, for a long period this work was forgotten, until it was recently shown that a synthesis technique based on Modified Bessel functions is very useful for the generation of high-quality, low-aliasing digital reproductions of the periodic waveforms used in analog subtractive synthesizers [3], for example sawtooth waves. Specifically, the synthesis equation is

$$s(t) = e^{m\cos(\omega_m t)}\cos(\omega_c t) \qquad (1)$$

where m is the modulation index. This can be expressed using Modified Bessel functions as

$$s(t) = I_0(m)\cos(\omega_c t) + \sum_{n=1}^{\infty} I_n(m)\left[\cos(\omega_c t - n\omega_m t) + \cos(\omega_c t + n\omega_m t)\right] \qquad (2)$$

where I_n(.) is a modified Bessel function of order n, and ω_c and ω_m are the carrier and modulation frequencies respectively [3].

From (2), it can be seen that equation (1) generates a harmonic signal with a frequency spacing of ω_m and magnitude scaling given by the set of Modified Bessel functions I_n(m), where n = 0, …, ∞.

In practical applications, (1) should be scaled by a factor [3]

$$g(m) = e^{-m} \qquad (3)$$

which leads to

$$s(t) = e^{m\cos(\omega_m t) - m}\cos(\omega_c t) \qquad (4)$$

If the ratio between the carrier and modulation frequencies is one, then Equation (4) describes a unipolar pulse train. The width, and thus the smoothness, of the pulse is determined by the value of the modulation index m. Lower values of m give a broader pulse shape. Figure 1 shows an example of this for values of m ranging from 2 to 14; for this plot the sampling rate was set to 8 kHz and ω_c = ω_m = 55 Hz.

Figure 1. Plot of the pulse shape defined by (4) for various values of modulation index (m = 2, 6, 10, 14).
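As a minimal sketch, the scaled pulse train of Eq. (4) can be synthesised directly:

import numpy as np

def bessel_pulse_train(m, fc, fm, fs, dur=1.0):
    # Eq. (4): s(t) = exp(m*cos(wm*t) - m) * cos(wc*t). With fc == fm
    # this is a unipolar pulse train that narrows as m grows.
    t = np.arange(int(dur * fs)) / fs
    return np.exp(m * np.cos(2 * np.pi * fm * t) - m) \
           * np.cos(2 * np.pi * fc * t)

# The settings reported for Figure 1: fs = 8000, fc = fm = 55, m = 2..14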


The spectrum of this pulse train is given by [3]

$$X(\omega) = \frac{1}{e^{m}}\sum_{n=1}^{\infty}\left(I_{n-1}(m) + I_{n+1}(m)\right)\cos(n\omega) \qquad (5)$$

The harmonic amplitudes of the spectrum in (5) are determined by the modified Bessel functions I_n(.), scaled by the factor g(m). This factor gives a smooth low-pass characteristic to the spectrum, with the steepness of the roll-off determined by the value of m. Figure 2 provides a spectral example of (5) for a carrier frequency of 100 Hz.

Figure 2. Plot of an example of (5) showing the low-pass characteristic of the harmonic amplitudes (carrier frequency 100 Hz).

The implication of this for the digital generation of the classic waveforms of subtractive synthesis is that, by integrating this pulse train, it is possible to create a signal spectrum that approximates that of a sawtooth wave, whose harmonic magnitudes decrease with respect to the harmonic number. The equation for the integrated spectrum is

$$X'(\omega) = \frac{e^{-m}}{\omega}\sum_{n=1}^{\infty}\left(I_{n-1}(m) + I_{n+1}(m)\right)\sin(n\omega) \qquad (6)$$

If the integrated signal is then passed through a suitable DC-blocker filter, as described in [4], the output waveshape should be close to that of a sawtooth, as illustrated in Figure 3. The example in Figure 3 was generated for carrier and modulator frequencies of 440 Hz and a sampling frequency of 44100 Hz. The modulation index was chosen to be 943 and was determined empirically. It is clear from Figure 3 that the waveshape is that of a sawtooth, validating the usefulness of this technique for the application.

Figure 3. Sawtooth wave of frequency 440 Hz generated by integrating (4), followed by DC blocking; its spectrum is given by (6).

II. OPTIMISING BANDLIMITED SIGNAL SYNTHESIS
What is of primary concern when creating this approximation is that the signal should be effectively bandlimited; that is, any aliased components should be of sufficiently low magnitude so as to be imperceptible. To ensure this, it is a question of choosing a suitable value for m such that any harmonics that exist in theory beyond half the sampling frequency are sufficiently small that their aliased versions will not be heard. This can be posed as an optimisation problem, as given in (7). Here, a figure of -90 dB is chosen as the upper threshold on the spectral magnitude of the aliased components [3]:

$$\max_m\left\{20\log_{10}\frac{I'_{N+1}(m)}{(N+1)\,I'_1(m)}\right\} \le -90 \qquad (7)$$

where N is the number of harmonics in the sawtooth wave from DC to half the sampling frequency, and

$$I'_n(m) = I_{n-1}(m) + I_{n+1}(m) \qquad (8)$$

To perform this optimization, it is possible to use a standard routine such as 'fmin', available in the Matlab software package, which uses a Nelder-Mead simplex search method [5]. However, in implementing the optimization of (7) a problem was discovered: when attempting to compute the magnitude of the modified Bessel function for values of modulation index greater than 700, the algorithm generates a numerical overflow and returns a value of infinity. Similar behaviour was observed when using the SciPy module of Python for the computation [6].

A. Computing Modified Bessel Functions using Logs
To compute a modified Bessel function of order n and modulation index m, it is possible to use the formula [7]

$$I(n,m) = \sum_{k=0}^{\infty} \frac{(m/2)^{n+2k}}{k!\,(n+k)!} \qquad (9)$$


From (9), both the numerator and denominator of the summation grow infinitely large, but empirical observation found that the value of their ratio first reaches a maximum and then decreases to zero as k increases. In an implementation, it can be surmised that the maximum number of terms in the summation can be restricted, as long as it exceeds the point where the ratio reaches zero. This is a valid approach but fails when the maximum of the ratio is beyond the numerical precision of the machine. As stated already, this occurs for large values of the modulation index, and thus an alternative is required in such cases. An obvious choice for compressing the numerical values generated by each term of (9) is to use the logarithmic function:

$$\log\left(\frac{(m/2)^{n+2k}}{k!\,(n+k)!}\right) = (n+2k)\log(m/2) - \log\left(k!\,(n+k)!\right) \qquad (10)$$

Using the logarithmic property

$$\log x^{z} = z\log x \qquad (11)$$

equation (10) can then be rewritten using the multiplicative property given in (12),

$$\log(xz) = \log x + \log z \qquad (12)$$

to give

$$\log\left(\frac{(m/2)^{n+2k}}{k!\,(n+k)!}\right) = (n+2k)\log(m/2) - \log(k!) - \log\left((n+k)!\right) \qquad (13)$$

A number of possibilities exist for expressing the logarithm of a factorial. Firstly, the exact expression is [8]

$$\log(x!) = \sum_{z=1}^{x}\log z \qquad (14)$$

Alternatively, a very good approximation for the logarithm of a factorial due to Ramanujan [8], for x ≠ 0, can be written as

$$\log(x!) \approx x\log x - x + \frac{\log\left(x(1+4x(1+2x))\right)}{6} + \frac{\log(\pi)}{2} \qquad (15)$$

which reduces the computational effort of evaluating log(x!) for each term. Furthermore, it is also more robust numerically: in particular, in the case of Matlab [5], if the number of terms k selected exceeds 170, then log(k!) computed via the factorial returns infinity.

Using (15), equation (13) can be rewritten to produce

$$(n+2k)\log(m/2) - k\log k + k - \frac{\log\left(k(1+4k(1+2k))\right)}{6} - (n+k)\log(n+k) + (n+k) - \frac{\log\left((n+k)(1+4(n+k)(1+2(n+k)))\right)}{6} - \log(\pi) \qquad (16)$$

B. Applying the MaxStar algorithm
From (16), it can be seen that each term in (10) can be computed using logarithms, which significantly reduces the size of its numerical value, thus avoiding overflow problems. However, the next issue is how to add the terms without computing the exponential of each term: they should also be added in the log domain, and the exponential taken only of the overall result. To this end, a very useful algorithm from the field of telecommunications is the MaxStar algorithm [9]:

$$\ln(e^{x} + e^{z}) = \max(x,z) + \ln\left(1 + e^{-|x-z|}\right) \qquad (17)$$

where max(x, z) = x if x ≥ z, and z otherwise. This expression is applied iteratively to the summation until the final term is reached [10]. An alternative approximation was recently given in [10]:

$$\ln(e^{x} + e^{z}) = \frac{x+z}{2} + \ln\left(2\cosh\left(\frac{x-z}{2}\right)\right) \qquad (18)$$

Once the sum of terms is found using (17) or (18), all that remains to compute (9) is to take the exponential of this sum. This procedure will work in Matlab [5] as long as the total value of the log of the sum of the terms in (9) is 709 or less, as otherwise an infinite output will result, because in this package

$$e^{710} = \infty \qquad (19)$$
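Pulling (13), (15) and (17) together, a minimal sketch of the log-domain evaluation of the series (9) might look as follows; log_In returns ln I(n, m), so the overflow-prone exponential is deferred (or avoided entirely, as in (20)-(27) below):

import numpy as np

def log_factorial(x):
    # Eq. (15): Ramanujan's approximation to log(x!); log(0!) = 0.
    if x == 0:
        return 0.0
    return (x * np.log(x) - x
            + np.log(x * (1 + 4 * x * (1 + 2 * x))) / 6
            + np.log(np.pi) / 2)

def maxstar(x, z):
    # Eq. (17): ln(e^x + e^z) without ever forming e^x or e^z.
    return max(x, z) + np.log1p(np.exp(-abs(x - z)))

def log_In(n, m, terms=2000):
    # ln I(n, m): series (9) accumulated in the log domain.
    acc = None
    for k in range(terms):
        # Eq. (13): log of the k-th term of the series
        term = ((n + 2 * k) * np.log(m / 2)
                - log_factorial(k) - log_factorial(n + k))
        acc = term if acc is None else maxstar(acc, term)
    return acc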


This could be problematic in the general case, but if all we want to do is solve the optimization problem of (7), then the logarithm can be rewritten as

$$\log_{10}\left(I'_{N+1}(m)(N+1)^{-1}\right) - \log_{10}\left(I'_{1}(m)\right) = \log_{10}(N+1)^{-1} + \log_{10}\left(I'_{N+1}(m)\right) - \log_{10}\left(I'_{1}(m)\right) \qquad (20)$$

Ignoring the first term, which is a constant, and substituting (8) into (20) gives

$$= \log_{10}\left(I_{N}(m) + I_{N+2}(m)\right) - \log_{10}\left(I_{0}(m) + I_{2}(m)\right) \qquad (21)$$

Both terms in (21) are structurally similar, so, for example, considering the first term only and defining the output of the MaxStar algorithm as MS(.) leads to the definition

$$MS(I_{n}(m)) = \underset{k=0\ldots\infty}{\mathrm{MaxStar}}\left\{\log\frac{(m/2)^{n+2k}}{k!\,(n+k)!}\right\} \qquad (22)$$

Then

$$\log_{10}\left(I_{N}(m) + I_{N+2}(m)\right) = \log_{10}\left(\exp\left(MS(I_{N}(m))\right) + \exp\left(MS(I_{N+2}(m))\right)\right) \qquad (23)$$

where, for clarity, the function exp(x) is used to represent e^x. Applying the MaxStar algorithm to (23) produces the nested expression in (24):

$$= \log_{10}\left(\exp\left(MS\left(MS(I_{N}(m)) + MS(I_{N+2}(m))\right)\right)\right) \qquad (24)$$

It is then possible to apply the property

$$\log_{y}(x^{z}) = z\log_{y}x \qquad (25)$$

which results in

$$= MS\left(MS(I_{N}(m)) + MS(I_{N+2}(m))\right)\log_{10}e \qquad (26)$$

Thus, the exponential power has been removed and is now present in (26) in the form of a multiplication by a constant. In this form, any issues with numerical overflow should be overcome. The optimization in (7) can finally be rewritten using the formulation in (26) as

$$\max_m\left\{\frac{\log_{10}(N+1)^{-1}}{\log_{10}e} + MS\left(MS(I_{N}(m)) + MS(I_{N+2}(m))\right) - MS\left(MS(I_{0}(m)) + MS(I_{2}(m))\right)\right\} \le \frac{-90}{20\log_{10}e} \qquad (27)$$

For example, if we have a pitch frequency for the sawtooth wave of 146.8324 Hz (note D) and a sampling rate of 44100 Hz, the number of harmonics that exist up to half the sampling frequency is N = 150. Setting the maximum number of terms in the summation to 2000, the optimization routine returns a modulation index of m = 2131.7. Figure 4 plots the lower portion of the spectrum of this sawtooth, after Hanning windowing, showing its harmonics with no visible alias components present.

Figure 4. Lower part of the spectrum of the optimized sawtooth (magnitude vs. frequency).

III. CONCLUSION
This short paper has presented an approach for the computation of Modified Bessel functions with high modulation indices that uses the MaxStar algorithm to overcome numerical difficulties currently experienced with mathematical software packages. Furthermore, it has shown how an optimisation formulation can be rewritten using the MaxStar algorithm, obviating the need to explicitly compute large numbers raised to an exponential power. Future work will seek out other similar applications where the MaxStar algorithm would prove useful.

IV. ACKNOWLEDGMENT
Victor Lazzarini would like to acknowledge the funding support given by An Foras Feasa for this work.

REFERENCES
[1] J. Chowning and D. Bristow, FM Theory & Applications - By Musicians for Musicians, Yamaha, Tokyo, 1986.
[2] J. A. Moorer, "The Synthesis of Complex Audio Spectra by Means of Discrete Summation Formulas", Journal of the Audio Engineering Society, 24(9), 1976.
[3] V. Lazzarini, J. Timoney and T. Lysaght, "A Modified FM Synthesis Approach to Bandlimited Signal Generation", Proc. 11th Conf. Digital Audio Effects (DAFx), Espoo, Finland, Sept. 1-4, 2008.
[4] R. Yates and R. Lyons, "DSP Tips & Tricks [DC Blocker Algorithms]", IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 132-134, March 2008.
[5] Matlab 5.3, The Mathworks, 1999.
[6] SciPy, Python module, 2009. http://www.scipy.org/
[7] G. N. Watson, A Treatise on the Theory of Bessel Functions, 2nd Ed., Cambridge Univ.
III. CONCLUSION

This short paper has presented an approach for the computation of Modified Bessel functions with high modulation indices that uses the MaxStar algorithm to overcome numerical difficulties currently experienced with mathematical software packages. Furthermore, it has also shown how an optimisation formulation can be rewritten using the MaxStar algorithm to obviate the need to explicitly compute large numbers raised to an exponential power. Future work will seek out other similar applications where the MaxStar algorithm would prove useful.

IV. ACKNOWLEDGMENT

Victor Lazzarini would like to acknowledge the funding support given by An Foras Feasa for this work.

REFERENCES
[1] J. Chowning and D. Bristow, FM Theory & Applications – By Musicians for Musicians, Yamaha, Tokyo, 1986.
[2] J. A. Moorer, "The Synthesis of Complex Audio Spectra by Means of Discrete Summation Formulas", Journal of the Audio Engineering Society, 24(9), 1976.
[3] V. Lazzarini, J. Timoney and T. Lysaght, "A Modified FM Synthesis Approach to Bandlimited Signal Generation", Proc. 11th Conf. Digital Audio Effects (DAFx), Espoo, Finland, Sept. 1-4, 2008.
[4] R. Yates and R. Lyons, "DSP Tips & Tricks [DC Blocker Algorithms]", IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 132-134, March 2008.
[5] Matlab 5.3, The Mathworks, 1999.
[6] SciPy, Python module, 2009. http://www.scipy.org/
[7] G. N. Watson, A Treatise on the Theory of Bessel Functions, 2nd Ed., Cambridge Univ. Press, Cambridge, UK, 1944.
[8] Factorial, Wikipedia entry, 2009: http://en.wikipedia.org/wiki/Factorial.
[9] J. A. Erfanian and S. Pasupathy, "Low-Complexity Parallel-Structure Symbol-by-Symbol Detection for ISI Channels", IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, June 1-2, 1989.
[10] Motorola Inc., Apparatus and Method for Calculating the Logarithm of a Sum of Exponentials, European Patent Office, EP20020293271, June 2004.


Section 2A
RADIO SYSTEMS


Wireless Billboard Channels over T-DMB

Zhanlin Ji, Ivan Ganchev, Máirtín O'Droma
Telecommunications Research Centre, University of Limerick, Ireland
{Zhanlin.Ji; Ivan.Ganchev; Mairtin.ODroma}@ul.ie

Abstract—This paper describes wireless billboard channels (WBCs) established over terrestrial digital multimedia broadcasting (T-DMB), which are used by service providers to broadcast advertisements of their services to mobile users, so that they may discover and associate with the 'best' service following the user-driven 'always best connected and best served' (ABC&S) principle in the emerging ubiquitous consumer wireless world (UCWW). A novel and more efficient IP datacasting (IPDC) operational mode for 'WBC over T-DMB' is proposed and evaluated.

Keywords—Ubiquitous Consumer Wireless World (UCWW); Wireless Billboard Channels (WBCs); Advertisement, Discovery and Association (ADA); Terrestrial Digital Multimedia Broadcasting (T-DMB); IP Datacasting (IPDC).

This publication has been supported by the Irish Research Council for Science, Engineering and Technology (IRCSET), Science Foundation Ireland and the Telecommunications Research Centre, University of Limerick, Ireland (http://www.ece.ul.ie/trc).

I. INTRODUCTION

WBCs are defined as simplex, unidirectional, narrowband broadcast channels [1], which are solely used to 'push' (wireless) service advertisements to a large number of mobile users (MUs). WBCs are a fundamental part of the ubiquitous consumer wireless world (UCWW) – the University of Limerick's next generation network (NGN) proposal [2~4]. The UCWW will bring many benefits to MUs, teleservice providers (TSPs), and access network providers (ANPs), including: greatly enhanced user freedom of choice in accessing services; elimination of roaming charges; a 'level playing field' for new ANPs and much greater commercial openness and fairness in the ANP market; realization of truly user-driven always best connected and best served (ABC&S) operation [5]; user-driven integrated heterogeneous networking; etc.

The following technologies are suitable carrier candidates for WBCs: digital audio broadcasting (DAB), terrestrial digital multimedia broadcasting (T-DMB), digital radio mondiale (DRM), digital video broadcasting – handheld (DVB-H), and the multimedia broadcast/multicast service (MBMS). Among these, T-DMB is a new multimedia broadcasting technique developed in Korea in 2005 based on the European DAB standard. Compared with DAB, T-DMB adds Reed-Solomon (RS) forward error correction (FEC) to improve communication over wireless channels and uses the highly efficient MPEG-4 Part 3 bit-sliced arithmetic coding (BSAC) or high-efficiency advanced audio codec (HE AAC) to replace DAB's MPEG Audio Layer 2 (MP2) audio codec scheme [6-8]. To improve spectrum efficiency, T-DMB uses an orthogonal frequency division multiplexing (OFDM) modulation scheme and supports single frequency network (SFN) operation. T-DMB is thus well suited as a carrier for WBCs.

The 'WBC over T-DMB' system is developed along three layers: a service layer, a link layer, and a physical layer.

The error protection of service advertisements is an important issue in WBCs, since even a single bit error may cause a full WBC segment to be discarded [9]. In the 'WBC over T-DMB' system, physical layer reliability is ensured by a Reed-Solomon (RS) code, a convolutional interleaver, a punctured convolutional code, and an inner interleaver. Service layer reliability is ensured by an advertisement delivery protocol (ADP) [10]. For extra error protection, a DVB-H compatible link layer is proposed in this paper to smooth IP datacasting (IPDC).

There are four T-DMB transmission modes defined by ETSI 300-401 [7], with 1536, 384, 192, and 768 subcarriers in one OFDM symbol respectively.
This paper focuses on WBCs established over T-DMB operating in all four transmission modes.

The rest of the paper is organized as follows. Section II presents the new 'WBC over T-DMB' IPDC mode. Section III describes the 'WBC over T-DMB' software testbed and the obtained performance evaluation results. Section IV summarizes the conclusions.

II. 'WBC OVER T-DMB' IPDC MODE

The T-DMB standard [7, 8] supports two operational modes: a stream mode – for broadcasting audio stream datasets, and a packet mode – for IP packet broadcasting. However, the stream mode is not suitable for WBCs, whereas the packet mode does not provide sufficient error protection for WBC data broadcasting over wireless fading channels. Thus, a digital video broadcasting – handheld (DVB-H) compatible module [11] was added on top of T-DMB as the link layer, operating in a new 'WBC over T-DMB' IP datacasting (IPDC) mode (Figure 1). With the strong outer layer of a FEC scheme, i.e., the multi-protocol encapsulation – forward error correction (MPE-FEC) solution, the reliability of 'WBC over T-DMB' can be improved without changing the T-DMB standard.

The MPE-FEC frame is the core element in the link layer. MPE-FEC is introduced in DVB-H to compensate for the performance degradations due to the use of wireless fading channels. It is a frame which consists of 255 columns and a number of rows (256, 512, 768 and 1024 rows are supported [11]). Every cell in the MPE-FEC frame is one byte. An MPE-FEC frame carries a number of WBC segments. The 255 columns are divided into two parts: from the 1st to the 191st column is the application data table (ADT), and from the 192nd to the 255th column is the RS data table (RSDT). Flexibility is allowed by puncturing some parity RSDT columns to achieve different code rates, i.e., 1/2, 2/3, 3/4, 5/6, 7/8.
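As an illustration of this frame layout, the following sketch (ours, in Python; `rs_encode_row` is a hypothetical placeholder for an RS(255,191) row encoder, not part of the authors' C++ testbed) fills the ADT column-wise and keeps only as many parity columns as the chosen puncturing allows:

```python
ROWS = 1024                      # 256, 512, 768 and 1024 rows are supported
ADT_COLS, RSDT_COLS = 191, 64    # application data table / RS data table

def build_mpe_fec_frame(datagrams, rs_encode_row, parity_cols=RSDT_COLS):
    """Fill the ADT column by column with IP datagrams, then add per-row
    RS parity; parity_cols < 64 punctures trailing RSDT columns to raise
    the effective code rate."""
    adt = bytearray(ROWS * ADT_COLS)          # zero-padded by default
    stream = b"".join(datagrams)[:len(adt)]
    adt[:len(stream)] = stream                # byte i lands in column
                                              # i // ROWS, row i % ROWS
    frame = []
    for r in range(ROWS):
        data = bytes(adt[c * ROWS + r] for c in range(ADT_COLS))
        parity = rs_encode_row(data)          # 64 parity bytes per row
        frame.append(data + parity[:parity_cols])
    return frame
```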


Figure 1. The new 'WBC over T-DMB' IPDC mode vs. the standard T-DMB stream mode and packet mode.

Figure 2. The 'WBC over T-DMB' link layer encoder's functional model.

III. 'WBC OVER T-DMB' SOFTWARE TESTBED

A. Layers Implementation

The WBC service layer has a three-tier software architecture [12]. Thanks to this architecture, the Internet-based enterprise application on the WBC-SP node and the portable device-based application on the mobile terminal are able to run in an intelligent, flexible, and extensible way.

The 'WBC over T-DMB' link layer software testbed was designed and implemented in C++. The encoder's functional model is depicted in Figure 2. The decoder running on the mobile terminal uses a reversed functional model.

The 'WBC over T-DMB' physical layer testbed was designed based on the ETSI EN 300 401 [7] and ETSI TS 102 427 [8] standards. The transmitter side first generates a 3008 B dataset (3008 = 16 × 188, the transport stream packet length), which is further processed as shown in Figure 3. In the outer coding section, a Galois field (GF) array is created from the dataset. The array is then encoded by an RS(204,188) module for extra error protection at the physical layer. In the outer interleaving section, the output matrix is first reshaped into a one-row dataset; then a convolutional interleaving algorithm is used to permute the one-row dataset with the help of an internal shift register algorithm; finally, the output is translated into a bit array by a byte-to-bit convertor. In the inner coding section, the bit array is first encoded by the convolutional encoder and then punctured at the predefined convolutional code rate. In the inner interleaving section, an inner bit-wise interleaving algorithm is used to provide extra reliability. The output of the inner interleaving section acts as the main service channel (MSC), which together with a random-dataset fast information channel (FIC) and a sync channel produces a T-DMB transmission bit frame in the MUX module.

Figure 3. The 'WBC over T-DMB' physical layer testbed.
In the mapping section, the T-DMB transmission bit frame is QPSK modulated. The output is then processed by a frequency interleaving algorithm to improve reliability, and modulated by an inverse fast Fourier transform (IFFT) algorithm in the OFDM modulation section. A T-DMB OFDM symbol transmission frame is then built, which is passed first through a Rayleigh fading channel and then through an additive white Gaussian noise (AWGN) channel. In the receiver, the data processing is performed in reverse order. The physical fading channel design follows the COST 207 model under typical urban reception conditions [13], which has been commonly used for wireless broadcasting simulations.
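The mapping and modulation steps just described can be sketched as follows (our own simplified Python/NumPy illustration, not the testbed code; the stride permutation stands in for the actual T-DMB frequency-interleaving rule, and the carrier placement is illustrative only):

```python
import numpy as np

def qpsk_map(bits):
    """Map bit pairs onto unit-energy QPSK symbols."""
    b = np.asarray(bits).reshape(-1, 2)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)

def ofdm_modulate(symbols, fft_size=2048, cp_ratio=0.25, stride=13):
    """Frequency-interleave, IFFT, and prepend a cyclic prefix
    (cp_ratio = 0.25 matches the simulation parameters given below)."""
    n = len(symbols)
    interleaved = symbols[(np.arange(n) * stride) % n]  # bijective when
                                                        # gcd(stride, n) == 1
    grid = np.zeros(fft_size, dtype=complex)
    grid[1:n + 1] = interleaved            # illustrative carrier placement
    time = np.fft.ifft(grid)
    cp = time[-int(fft_size * cp_ratio):]
    return np.concatenate([cp, time])
```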


Figure 4. The IPER in packet mode and IPDC mode: (a) Doppler frequency = 10 Hz; (b) Doppler frequency = 80 Hz.

B. Simulation Results

The IP packet error rate (IPER) is an important criterion for measuring the performance of 'WBC over T-DMB'. For the purposes of simulation, 5000 distinct WBC segments were produced at the service layer by means of the ADP protocol and sent down to the link layer for broadcasting, using the following parameters: WBC segment size – 4016 B; IP packet length – 1024 B; ADP (n=6, k=4). To test the IPER in offline mode, the link layer first stores all IP packets in a database and processes them one by one. The MPE-FEC frame has 1024 rows and 255 columns. At the physical layer, the following parameters were used: WBC source rate – 384 kbps; data length in 24 ms – 9216 bits; convolutional code rate – 1/2; interleaving depth – 16; cyclic prefix ratio – 0.25.

Figures 4(a) and 4(b) show the IPER simulation results for the packet mode and the IPDC mode when the Doppler frequency is equal to 10 Hz and 80 Hz, respectively. The results confirm that the IPDC mode outperforms the packet mode, with approximately 2.5 dB SNR gain at an IPER value of 10^-2. The results also show that transmission modes III and II have better resistance to the Doppler effect than modes IV and I. The reason is that they use a higher subcarrier spacing (8 kHz and 4 kHz, respectively) than transmission modes IV and I (2 kHz and 1 kHz, respectively), which gives better results in fading environments. So, for the same Doppler value, the normalized Doppler frequencies are ordered: mode III < mode II < mode IV < mode I.


[5] M. O'Droma and I. Ganchev, "Enabling an Always Best-Connected Defined 4G Wireless World," Annual Review of Communications, Vol. 57, International Engineering Consortium, Chicago, Ill., ISBN 1-931695-28-8, pp. 1157-1170, 2004.
[6] Byungjun Bae, Joungil Yun, Sammo Cho, Young Kwon Hahm, Soo In Lee, and Kyu-Ik Sohng, "Design and Implementation of the Ensemble Remultiplexer for DMB Service Based on Eureka-147," ETRI J., vol. 26, no. 4, pp. 367-370, Aug. 2004.
[7] ETSI, Radio Broadcasting Systems; Digital Audio Broadcasting (DAB) to Mobile, Portable and Fixed Receivers, ETSI EN 300 401 V1.3.3, May 2001.
[8] ETSI, "Digital Audio Broadcasting (DAB); Data Broadcasting – MPEG-2 TS Streaming," ETSI TS 102 427, V1.1.1, July 2005.
[9] Zh. Ji, I. Ganchev, and M. O'Droma, "Efficient Collecting, Clustering, Scheduling, and Indexing Schemes for Advertisement of Services over Wireless Billboard Channels," Proc. of the 15th International Conference on Telecommunications (ICT 2008), pp. x.1-x.6, St. Petersburg, Russia, 16-19 June 2008. ISBN 978-1-4244-2036-0. DOI 10.1109/ICTEL.2008.4652649.
[10] Zh. Ji, I. Ganchev, and M. O'Droma, "Reliable and Efficient Advertisements Delivery Protocol for Use on Wireless Billboard Channels," Proc. of the 12th IEEE International Symposium on Consumer Electronics (IEEE ISCE 2008), Algarve, Portugal, 14-16 April 2008. DOI 10.1109/ISCE.2008.4559488.
[11] ETSI, "Digital Video Broadcasting (DVB); DVB-H Implementation Guidelines," ETSI TR 102 377, V1.2.1, 2005.
[12] Zh. Ji, I. Ganchev, and M. O'Droma, "Intelligent Software Architecture for the Service Layer of Wireless Billboard Channels," Proc. of the 6th Annual IEEE Consumer Communications & Networking Conference (CCNC09), pp. 1-2, Las Vegas, USA, 10-13 January 2009. ISBN 978-1-4244-2309-5/09. DOI 10.1109/CCNC.2009.4784824.
[13] COST 207 Report, "Digital Land Mobile Radio Communications," Commission of the European Communities, Directorate General Telecommunications, Information Industries and Innovation, Luxembourg, 1989.


RF SDR for Wideband PMR

Ling Gao, Ronan Farrell
Centre for Telecommunications Value Chain Research,
National University of Ireland, Maynooth, Co. Kildare, Ireland
lgao@eeng.nuim.ie, ronan.farrell@nuim.ie

ABSTRACT

TErrestrial Trunked Radio (TETRA) offers capabilities equivalent to the second generation of mobile phones, with voice and limited data capabilities. TETRA needs to evolve to satisfy increasing user demand for new services and facilities, as well as to glean the benefits of new technology. An initial enhancement (TETRA Enhanced Data Service, TEDS) has been agreed. The enhanced TETRA services allow more flexibility in the communication modes used, so as to provide adaptability in applications. We propose that it is possible to deploy Software Defined Radio (SDR) technologies in the basestation to economically provide this level of flexibility, and to further extend the capability of TETRA services by deploying a WiMAX channel within the proposed TETRA tuning range, thus delivering a true broadband data service while simultaneously supporting the original and enhanced TETRA services.

1. INTRODUCTION

TETRA is a Private Mobile Radio (PMR) standard that has been developed by the European Telecommunications Standards Institute (ETSI) for the needs of the transport, civil and emergency services [1]. TETRAPOL is another PMR standard, developed by Matra Nortel Communications. TETRA and TETRAPOL are competitors in the PMR market in Europe. In this paper we focus on TETRA services, as it is a more recent standard than TETRAPOL. For perspective, we will compare the radio characteristics of TETRA and TETRAPOL later (Table 1).

There is increased interest in the delivery of broadband data services over the TETRA network, for example video imagery of accident scenes. An enhanced form of TETRA (TEDS) has been agreed which can offer data rates of up to 600 kbps [2]. However, successful deployment of TEDS requires additional spectrum to be allocated, and this has proved to be problematic. An investigation carried out by ETSI concluded that a single standardised frequency band cannot be agreed; however, the concept of a tuning range for enhanced TETRA services is gaining acceptance. In addition to the difficulty in agreeing a standardised spectrum allocation, enhanced TETRA supports a range of communication modes depending on individual user bandwidth and signal quality. This implies a greater complexity in the radio systems. Though the new TETRA services will offer improved capabilities, it is necessary to provide backward compatibility with existing TETRA users, as there are over 1000 networks currently deployed around the world [3]. The greatest challenges will be experienced by the TETRA basestations, which must support new and legacy systems. SDR, specifically in the form of flexible hardware transceiver systems, offers an economical solution to both the challenge of implementing TEDS and that of supporting legacy systems, and provides a development route for new TETRA services.

This work concerns the integration of a WiMAX sub-channel into the TETRA framework for true broadband services on demand. Similar initiatives – a WiMAX overlay over TETRA demonstration for an emergency call-handling system by Alcatel Lucent, and the TelMAX project by Teltronic – have also explored the integration of WiMAX channels over TETRA bands.
This work is focussed on the integration of the TETRA and WiMAX standards within a single physical layer SDR transceiver, rather than the use of separate radio front-ends.

This paper will present the requirements for an SDR platform, with an investigation of various radio architectures to support the proposed and legacy schemes. We will then show the implementation of our proposed RF receiver architecture, plus the design challenges for this experimental platform.

2. COMBINING WIMAX AND TETRA

TETRA services were initially deployed in Europe in a 20 MHz band between 380 and 400 MHz, as two 5 MHz bands with a 10 MHz duplex separation [1]. To deploy the new enhanced TETRA data services, additional spectrum is required to complement the existing band. The Electronic Communications Committee (ECC) within the European Conference of Postal and Telecommunications Administrations (CEPT) has proposed a "tuning range" within which enhanced TETRA services can be deployed [4].


It recommends three bands within that tuning range, including the original TETRA band, as shown below (Figure 1). The tuning range requirements are further complicated as non-European deployments have used other frequency ranges. One particularly interesting aspect is the Federal Communications Commission (FCC) proposed national public service network at 758-793 MHz [5], which would be attractive to any future TETRA-type network.

Figure 1. System tuning range.

Enhanced TETRA allows for channel widths up to 150 kHz, offering users a range of data rates, up to 600 kbps. This is a significant improvement on existing TETRA services; however, it does not offer data rates that would support full multimedia transmissions or rapid delivery of large files. Though TEDS has identified a maximum channel width of 150 kHz, there is nothing inherent in the TETRA framework that prevents wider channels from being used. We propose that WiMAX (IEEE 802.16e) offers features that are highly suited to TETRA-type applications, such as quality-of-service guarantees and scalable OFDM access. The WiMAX standard allows for a 1.25 MHz channel [6], which would allow up to three 1.25 MHz WiMAX channels to be deployed, with the remaining spectrum then used to support voice and data services, whether using TETRA or TEDS, thus maintaining legacy support (Figure 2).

Figure 2. 5 MHz TETRA channel.

The key advantage of using the WiMAX standard is its scalable OFDM access scheme (OFDMA), in which users are dynamically allocated bandwidth as needed for their application, according to their quality-of-service metric, allowing users to obtain bursts of data throughput of up to 6 Mbps when needed. WiMAX offers low-cost delivery of higher data rates over large geographical areas and also performs very well in mobile conditions. With WiMAX's enhanced channel efficiency of up to 5 bits/hertz, a greater number of users and applications can be served.

The use of high data rate OFDMA modulations brings challenging requirements for the transmitter in terms of spectral quality and Error Vector Magnitude (EVM). The receiver also faces some difficulties. The high EVM required is difficult to attain because it demands a high Signal-to-Noise Ratio (SNR) from the Low Noise Amplifier (LNA), about 35 dB. Other challenges are that the receiver must exhibit low power consumption, high bandwidth and high dynamic range [7].

If basestations are to be designed using full channel capture and channelisation in the digital domain, implementing this WiMAX sub-channel requires only a small modification of the software implementation of the physical layer and then, subsequently, a separate WiMAX stack.

3. SOFTWARE DEFINED RADIO PLATFORM REQUIREMENTS

To develop a new system that suits our proposal, the main radio characteristics of the TETRA, TEDS, TETRAPOL and WiMAX standards were studied, as follows:

Table 1. Radio characteristics of TETRA, TEDS, TETRAPOL and Mobile WiMAX

                          TETRA             TEDS                  TETRAPOL           Mobile WiMAX
  Frequency (MHz)         380-410           350-470               80/380/450         410-470, 758-793
  Spectrum allocation     Two 5 MHz bands   additional 5 MHz      similar to TETRA   similar to TETRA
                                            bands
  Duplex spacing (MHz)    10                10                    similar to TETRA   similar to TETRA
  Channel BW (kHz)        25                25-150
  Channel spacing (kHz)
  Access scheme           TDMA/FDMA
  Modulation              π/4 DQPSK
  Tx power (dBm)
  Rx sensitivity (dBm)    -103 to -106      similar to TETRA      -113 to -111
  Efficiency (bits/Hz)    1.4


Figure 3. Proposed test platform.

The equipment needed comprises a Rohde & Schwarz Vector Signal Generator SMU, a Rohde & Schwarz Vector Signal Analyzer FSQ, a PC, and the low-cost experimental SDR system MARS from IMWS, NUI Maynooth. We plan to generate TETRA+WiMAX I&Q analog signals from the R&S vector signal generator SMU200 and connect it to the R&S vector signal analyzer FSQ, then use the R&S Matlab transfer toolbox to retrieve the IQ files from the FSQ. The reason for doing this is that the internal IQ files within the firmware of the SMU200 are not available to users. We then transmit the IQ data to the MARS transmitter and our newly designed superheterodyne receiver (Figure 3). This platform requires further work to meet linearity and noise requirements. The main issues that need to be addressed are gain, matching networks, oscillator performance and signal/power levels. We will then connect the Tx and Rx to the FSQ to see how the TETRA + WiMAX signals perform.

5. RECEIVER IMPLEMENTATION

The SDR receiver is implemented using as many off-the-shelf parts as possible. The receiver implementation diagram is shown in Figure 4.

Figure 4. Receiver implementation.

We will have one RF-IF board on top of a baseband board. The RF bandpass filter is a 3rd-order Chebyshev design operating over a frequency range from 380 MHz to 480 MHz. The LNA is an Agilent ATF55143, with a gain of 17.7 dB at a noise figure of 0.6 dB and an IP3 of 24.2 dBm, capable of operating across a frequency range from 450 MHz to 6 GHz. Although 380 MHz to 480 MHz is outside this LNA's frequency range, we re-designed the matching network and simulated it in the Agilent Advanced Design System tool. An Analog Devices part, the AD8348, was chosen as the downconverter. It has a conversion gain of up to 44 dB through the use of AGC, with a noise figure of 11 dB and an IIP3 of 28 dBm. The AD8348 can be interfaced with a detector such as the AD8362 rms-to-dc converter to provide an automatic signal-levelling function for the baseband outputs. The ADF4360-7 is an integrated integer-N synthesizer and voltage controlled oscillator (VCO); its centre frequency is set by external inductors, allowing a frequency of between 350 MHz and 1800 MHz.

The IF filter that we have chosen is an EPCOS SAW filter. Its centre frequency is 140 MHz with a bandwidth of 8.8 MHz. The ADL5530 is a broadband, fixed-gain, linear amplifier that operates at frequencies up to 1000 MHz. It provides a gain of 16.5 dB and achieves an OIP3 of 37 dBm, with an output compression point of 21.8 dBm and a noise figure of 3 dB. The IF downconverter is the same component as in the RF stage, an Analog Devices AD8348, with separate I and Q outputs from the mixers. Its oscillator signal comes from an ADF4360-9, an integrated integer-N synthesizer and VCO. This configuration is capable of producing a frequency in a range from 65 MHz to 400 MHz; here the fixed centre frequency is 140 MHz. Two low-pass filters follow, with bandwidths of 3.5 MHz for both I and Q.

Next, the signal is digitised using two 16-bit Analog Devices ADCs capable of operating at up to 80 Msps on the baseband board developed by IMWS at NUI Maynooth. This digitised information is then transferred to the host computer for final processing and data extraction over a USB2 interface.

The receiver PCB board layout was then developed in the Easily Applicable Graphical Layout Editor (EAGLE) (Figure 5).
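As a rough sanity check on this line-up, the quoted stage figures can be cascaded with the Friis formula (our own back-of-the-envelope Python sketch; the stage ordering is simplified and filter/matching losses are ignored):

```python
import math

def friis_nf_db(stages):
    """Cascade noise figure, stages given as (gain_dB, nf_dB) tuples."""
    total_f, gain = 1.0, 1.0
    for g_db, nf_db in stages:
        f = 10 ** (nf_db / 10)
        total_f += (f - 1) / gain       # each stage referred to the input
        gain *= 10 ** (g_db / 10)
    return 10 * math.log10(total_f)

# LNA (17.7 dB, 0.6 dB NF), RF downconverter (44 dB, 11 dB NF),
# IF amplifier (16.5 dB, 3 dB NF), as quoted in the text:
print(friis_nf_db([(17.7, 0.6), (44.0, 11.0), (16.5, 3.0)]))  # ~1.3 dB
```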


Figure 5. Receiver PCB board.

6. DESIGN CHALLENGES

From a basestation perspective, this proposed test platform offers a number of challenges, specifically maintaining noise and linearity performance over such a range of frequencies and handling the different modes of operation.

One of the challenges of designing a combined communication system is that it must remain compatible with legacy TETRA services. This is particularly challenging as the TETRA specifications were designed for very narrowband 25 kHz channels, specifically the figures on linearity and sensitivity. High sensitivity is needed as TETRA basestations are not typically as densely deployed as comparable mobile telephony systems. Complicating the matter is the need for TETRA clients to be capable of sustaining high receive power levels when close to such basestations [8]. The basis of our analysis was the need to be compatible with legacy systems, while accepting that some compromises would be needed on adjacent channel specifications, as the legacy values are not appropriate to our wideband solution. As we are focussed on basestation radios, we also assume that receiver power levels can be taken to be low.

The challenges for an SDR platform are focused on the RF-IF stages rather than the software framework. Specifically, there are demanding receiver requirements on signal sensitivity, adjacent channel rejection, and linearity. These issues were manageable when dealing with narrowband signals at a specific frequency, but they become much more challenging when dealing with a wide tuning range. One particular issue is the problem of the transceiver filter, which must be wideband or reconfigurable in some way. This will limit our ability to minimize adjacent channel interference. To address the issue of varying sub-channel widths, it will be necessary to undertake full channel capture and subsequently perform channelisation, filtering and demodulation digitally. If this approach is taken, minimizing wideband noise contributions from the electronics and adjacent channels becomes particularly important. To investigate the interference issue, we examined the blocker specifications for a TETRA 25 kHz QAM receiver (Figure 6).

Figure 6. Blocker specifications for a TETRA 25 kHz QAM receiver.

At +/-75 kHz offset, the level of the interfering signal is -40 dBm. At +/-150 kHz offset, the level of the interfering signal is -35 dBm. At +/-350 kHz offset, the level of the interfering signal is -30 dBm. At +/-1 MHz offset, the level of the interfering signal is -25 dBm. The WiMAX signal has to be lower than -35 dBm/-30 dBm, and the maximum tolerated input power is 0 dBm. The filter specifications, and how far we place the WiMAX channel from the TETRA channel, are therefore critical.
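These limits can be captured as a simple lookup, sketched below (our own hypothetical Python helper, useful when scripting placement trade-offs for the WiMAX sub-channel):

```python
# TETRA 25 kHz QAM blocker mask from Figure 6: (offset edge in Hz,
# maximum tolerable interferer level in dBm).
BLOCKER_MASK = [(75e3, -40), (150e3, -35), (350e3, -30), (1e6, -25)]

def max_tolerated_dbm(offset_hz):
    """Blocker limit applying at a given offset from the TETRA carrier."""
    for edge, level in BLOCKER_MASK:
        if abs(offset_hz) <= edge:
            return level
    return 0.0  # beyond the mask only the 0 dBm maximum input power applies

# e.g. a WiMAX channel edge 350 kHz from the TETRA carrier:
print(max_tolerated_dbm(350e3))  # -> -30
```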
7. CONCLUSION

In this paper, we have reviewed the TETRA, TEDS, TETRAPOL and WiMAX standards. A new combined system specification for the transceiver has been presented to show how a WiMAX sub-channel can be integrated into a TETRA channel while retaining legacy compatibility. We focused on RF front-end receiver architectures, with a discussion of the relative benefits of homodyne and heterodyne architectures. The challenge of adding a broadband channel into the existing TETRA framework is complex and places significant constraints on future TETRA receivers, but we propose that following a software-defined radio philosophy allows for implementation with minimal additional hardware complexity. Our next step is to adapt the LING superheterodyne receiver to an existing MARS transmitter and demonstrate this proposed reconfigurable radio platform. If successful, this approach may allow future TETRA users to avail of broadband data rates at minimal additional cost for either the user or the basestation provider.


ACKNOWLEDGEMENTS

The authors wish to thank Philippe Mege, Gilles Latouche and Laurent Martinod of EADS Secure Networks for their assistance and support. The authors also extend thanks to the sponsors, EADS and IRCSET, for the PhD programme.

REFERENCES
[1] P. Whitehead, "The other communications revolution [TETRA standard]", IEE Review, volume 42, pp. 167-170, 1996.
[2] M. Nouri, V. Lottici, R. Reggiannini, D. Ball and M. Rayne, "TEDS: A high speed digital mobile communication air interface for professional users", IEEE Vehicular Technology Magazine, volume 1, issue 4, pp. 32-42, 2006.
[3] Juan Ferro, "A TETRA market overview", TETRA Experience China, TETRA Association, 2006.
[4] Draft ECC Report on Public Protection and Disaster Relief Spectrum Requirements, ECC/CEPT/102, Electronic Communications Committee, September 2006.
[5] Don Bishop, "Coming to America: TETRA – one way or another", Mobile Radio Technology (MRT) Magazine, January 2001.
[6] IEEE 802 Working Group, http://standards.ieee.org/getieee802/802.16.html
[7] Luis Abraham Sanchez-Gaspariano and Alejandro Diaz-Sanchez, "IEEE 802.16e design issues and transceiver architecture selection for mobile WiMAX systems", IEEE Computer Society, February 2008.
[8] Private conversation with Philippe Mege, Gilles Latouche and Laurent Martinod of EADS Secure Networks.
[9] G. Baldwin, L. Ruíz and R. Farrell, "Low-Cost Experimental Software Defined Radio System", SDR Technical Forum 2007, November 2007.


Q-Learning for Cognitive Radios

Neil Hosey
Dept. of Computer Science
NUI Maynooth
Maynooth, Co. Kildare
Email: nhosey@cs.nuim.ie

Susan Bergin
Dept. of Computer Science,
NUI Maynooth,
Maynooth, Co. Kildare

Diarmuid O'Donohue
Dept. of Computer Science,
NUI Maynooth,
Maynooth, Co. Kildare

Abstract—Machine learning approaches such as Reinforcement Learning (RL) can be used to solve problems such as spectrum sensing and channel allocation in the cognitive radio domain. These approaches have been applied in other similar domains, such as mobile telephone networks, and have shown much greater performance than the static channel allocation schemes otherwise used. The objective of this research is to use an RL technique known as Q-Learning to provide a possible solution for allocating channels in a wireless network containing independent cognitive nodes. Q-Learning is an attractive algorithm for such a problem because of its low computational demands per iteration. Many of the currently proposed techniques suggest using a negotiation policy between two nodes to decide on which channel each may use; however, a considerable problem with this is the overhead involved in the negotiation between the nodes. This paper suggests an approach where each node acts as an individual, independent node, with minimal interaction with the other nodes. Results have shown that using such a technique gives fast convergence on an optimal solution when correct rates are chosen. They have also shown that the algorithm is very scalable, in that as the network grows, the state-action space does not grow sufficiently to cause major memory or computational demands.

I. INTRODUCTION

Research in the area of cognitive radio (CR) has broadened significantly in the past number of years since it was first presented by Mitola in 1999 [1]. It is now recognised as an essential replacement for the current regulation of the electromagnetic spectrum, under which vast bands of usable spectrum are underutilised. One such example of this was in the USA, where the Spectrum Policy Task Force [2] found that, for a particular period on a police broadcasting channel, the typical channel occupancy was less than 15%, while the peak usage was nearly 85%. This has led to much research into the area of opportunistically accessing underutilised spectrum where no primary user is currently active.

The goal of this research is not to focus on primary and secondary users in a frequency domain, but rather to allow each cognitive node to learn by its own mistakes. In this case, the agent considers all other cognitive nodes, and any other transmitters working on the same channels, as being part of the environment. This will ensure an even distribution of channels not only for the cognitive nodes, but also for any other type of wireless communication device. It is hoped that future work will look at primary and secondary users working in the same environment using a Q-Learning approach. Much of the research in this area has focused on cooperative sensing, where nodes within a cognitive radio network share information about the environment. There has been very little work focusing on reinforcement learning techniques such as Q-Learning; currently there is only one other paper which looks at Q-Learning for channel allocation in cognitive radios [3]. There are several levels on which this cooperation can happen [4], which at the very least require a control channel to pass information between nodes.
This alone can lead to massive overheads on a network as the number of nodes and the amount of data shared increase. The proposed solution is individual sensing, where each node is a single cognitive entity having the ability to acquire information about the environment or network without the help of other nodes in its vicinity.

This is achieved using a reinforcement learning algorithm known as Q-Learning, whereby the agent goes through a phase of learning before it can converge on an optimal solution for channel allocation. In this learning phase, the node makes decisions on which channels to select pseudo-randomly; the outcomes of taking these actions will weigh strongly on what decisions are made later on. Once a node has finished learning, it can then make decisions based on what it has learned. The ability of a node to preempt whether a channel is going to be in use before accessing it allows it to optimise bandwidth usage for itself and any other nodes that may be accessing the same channel.

The remainder of this paper is organised as follows. First, an overview of reinforcement learning is presented, along with an in-depth look at Q-Learning and how it is applicable to this domain. Details on how Q-Learning has been applied to channel selection in cognitive networks are then provided. Results and outcomes of the simulations performed are then given, followed by current and future work. Finally, conclusions from this work and possible future work in this area are provided.

II. REINFORCEMENT LEARNING

A. Application of Reinforcement Learning

Reinforcement Learning is a machine learning technique whereby an agent interacts with an environment in the hope of achieving a goal. This interaction occurs on a continual basis, with the hope of the agent being able to learn to function in an optimal fashion within that environment.


The way in which the agent interacts with the environment is through a series of actions that can be performed. These actions can have positive or negative outcomes which can, over time, be used to determine how best to work in the current environment. At each point in time, an agent can be in a particular state, with the ability to choose an action based on what it has learned in previous iterations.

The overall goal is to find an optimal policy that maps each state to an action an agent should take in that state [6]. Figure 1 shows how the agent interacts with the environment and uses this to determine its next state.

Fig. 1. Agent-Environment Relationship.

To represent this formally, we assume the agent receives the next state from the environment, as shown in Figure 1, at each iteration of the algorithm: s_t ∈ S, where S is the set of possible states and t = 0, 1, 2, ... for each individual discrete timestep or iteration. On receiving this state, an action is chosen, a_t ∈ A(s_t), where A(s_t) is the set of possible actions that can be taken in state s_t. On taking this action, the agent observes the result and receives a reward r_{t+1}, where r_{t+1} ∈ R. After taking this action, the agent has now moved into a new state, s_{t+1}. At each iteration of the algorithm, a policy is created that maps the action taken, a_t, to the state s_t, and this policy is denoted by π_t(s_t, a_t). The way in which this mapping occurs is dependent on each RL algorithm and is usually based on one of a number of action selection strategies, which will be described in the next section.

B. Q-Learning Algorithm

Q-Learning is an off-policy temporal-difference RL algorithm introduced by Watkins in 1989. The algorithm works by learning an action-value function that gives the expected utility of taking an action in a particular state and following that policy thereafter [7]. The environment in which the agents exist can be modelled as a Markov Decision Process. The agent-environment interaction shown in Figure 1 consists of a number of steps:

• The agent examines state s_t ∈ S.
• Action a_t ∈ A is taken based on s_t.
• A transition occurs as a result of action a_t being taken, and a new state s_{t+1} is taken into account. A reward, r_t, is generated based on this transition.
• The reward, r_t, that is returned is then stored or learned for that state-action pair, and the above process is repeated.

The goal of repeating this process is for the agent to find an optimal policy π* ∈ π for each state s_t ∈ S in a recursive manner. The fact that the Q-Learning algorithm can converge on π* without having any prior knowledge of the environment makes it very suitable for cognitive radio channel selection, because of the unpredictability of other nodes and the electromagnetic spectrum.

The algorithm can be described as a simple value-iteration update, as shown below:

Q(s_t, a_t) ← Q(s_t, a_t) + α(s_t, a_t) × [r_t + γ × max_a Q(s_{t+1}, a) − Q(s_t, a_t)]   (1)

where α(s_t, a_t) is the learning rate, 0 < α ≤ 1, which represents to what extent newly acquired information will be taken into account. A learning rate of 1 means that only the most recent rewards are taken into account, whereas a learning rate of 0 means the agent learns nothing, and any current reward is discarded. The discount factor, γ, 0 < γ ≤ 1, decides how important future rewards are for the agent.
Another thing that makes the Q-Learning algorithm suitable for this type of problem is that it has been shown [8] that Q-Learning will converge with a probability of 1, as long as each state-action pair is visited infinitely often as the learning rate approaches zero. The way in which the Q function in Equation (1) is implemented can be shown in pseudocode:

    For each episode, while s is not terminal:
        Sense environment and state s
        For each iteration:
            Choose a from s using a certain policy
            Take action a; observe output r and s_{t+1}
            Update the Q value for the state-action pair (eqn. 1)
            s ← s_{t+1}
        Loop
    Loop

The electromagnetic spectrum environment that the agent is working in is very unpredictable, making it suitable to use an off-policy RL algorithm such as Q-Learning, so as to allow a period of random exploration before following the target policy of the agent. The policy used in selecting which action to take depends on the type of policy used. The simplest example is to select the action with the greatest reward for that state, although this may not always lead to an optimal solution, as it amounts to a totally greedy policy that would not explore parts of the state space that do not appear to be advantageous but could lead to an optimal solution in the future.

ε-Greedy is an example of a strategy that overcomes this problem. It does so by only selecting the best action 1−ε of the time, with another action chosen randomly for the remainder of the time, ε.
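A minimal sketch of the update rule (1) together with ε-greedy selection is shown below (ours, written in Python for brevity; the paper's implementation is in Java, and all names here are illustrative). Storing Q as a dictionary keyed on (state, action) pairs matches the small state-action space discussed later:

```python
import random

def select_action(Q, state, actions, epsilon):
    """epsilon-greedy: explore with probability epsilon, else exploit."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0.0))

def q_update(Q, state, action, reward, next_state, actions, alpha, gamma):
    """One application of the value-iteration update in equation (1)."""
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
```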


The value of ε is in the range 0 < ε ≤ 1: the higher the value, the more random exploration will occur. A similar strategy, known as the ε-decreasing strategy, is what is used in this experiment. The main difference between this and ε-Greedy is that ε decreases over a period of time, so that the agent goes through a period of random exploration, or learning, before becoming totally exploitative.

III. ALGORITHM IMPLEMENTATION

To explain how Q-Learning was used for this problem, an examination of how each state, action, and reward is structured is provided. First, the available spectrum was broken up into a number of channels which could be used for communication. The number of channels available, C, used in this experiment was 4, but this can change depending on the available spectrum. This number of channels was chosen to provide a simple case, although any number of channels could be chosen, as this implementation does not suffer from scalability problems.

The state was defined as a 2-dimensional structure

s_t = (tr_t, if_t)   (2)

where the 1st dimension, tr_t, represents the number of channels that a node is currently transmitting on at that time, 0 < tr_t ≤ C, and the 2nd dimension, if_t, represents the number of channels that the agent attempted to transmit on but which were in use at that time, 0 ≤ if_t ≤ tr_t. The number of interfering channels is based on the number of channels in use, either by other nodes or through interference, that are within range of the node. Through sensing the environment, a binary vector, if_t(c), is populated as per equation (3):

if_t(c_t) = 1 if channel c at time t is in use, 0 otherwise   (3)

where c_t = 1, 2, ..., C. This vector is scanned at each iteration upon taking an action to move to the next state. As the network may be spread over a large area, and would not be a fully-connected network as a result of wireless restrictions on distance, a connectivity vector, V, is used to store which cognitive nodes are within range. This is only needed for simulations, as a real-world implementation would only be able to sense nodes within the wireless transceiver's maximum range. There are 3 possible actions that can be taken by an agent at a particular timestep:

• I - do nothing.
• II - acquire a channel.
• III - drop a channel.

Action I will not acquire or drop any channels, action II will acquire a channel for transmission, and action III will drop a channel that is already in use. The channel that is selected for drop or acquire is completely random in this implementation, but future work may allow an agent to learn which channels are good and which are bad.

There is an immediate reward or punishment received for taking action a_t while in state s_t. Although there are many possible ways in which to calculate a reward in such an instance, it is important to ensure that the agent doesn't act in a greedy manner by acquiring as many channels as possible, leaving other nodes in the network, or surrounding networks, starved of bandwidth.
The proposed function, shown in equation (4), ensures that the number of active channels is proportionately greater than the number of interfering channels. The weighting of the interfering channels forces a punishment on any agent that acquires a high proportion of channels in an environment where channel usage is high.

r(s_t, a_t) = tr_t − (tr_t × if_t) − 1   (4)

This function ensures the channels are evenly distributed between all the nodes of the network, and also for other devices operating in the same environment. As a simple example, where an agent is transmitting on 2 channels and has also attempted to transmit and failed on another channel after action 1 (acquire channel) was chosen, the state the agent would be in is s_t(2, 1). So the reward calculated based on equation (4) is

r(s(2, 1), 1) = 2 − (2 × 1) − 1 = −1   (5)

In this case, there is a small punishment that the agent receives for this state-action pair. It shows that because the agent is transmitting on 2 channels but interfering on one of those channels, the reward is negative.

As mentioned in the previous section, the policy by which the actions are chosen for Q-Learning is based on a particular strategy, in this case a hybrid of ε-Greedy known as the ε-Decreasing strategy. The idea of this, as explained in the previous section, is to allow the agent to go through a period of random learning, or exploration, before exploiting what it has learned.

Many of the simulations that we have carried out have used a value for ε of 0.8. An example of how action selection occurs is as follows. In the first iteration of the algorithm, there is an 80% probability that a random action will be chosen (exploration) and a 20% probability that the best (or max Q value) action will be chosen (exploitation). As the algorithm progresses, this value decreases to allow the agent to slowly transform into one that selects the best action based on what it has learned, rather than randomly hopping through the state space, thus exploiting the information it has gathered. As ε approaches zero, the agent should eventually converge on a stable, non-greedy state that uses an appropriate number of channels without causing interference to other nodes in the environment. It then converges on a fixed state, where the agent is transferring over a fixed number of channels, until the environment changes enough to warrant re-learning. These changes could be due to other nodes leaving or joining the network, or to other outside interference.

As this value reduces over time and effectively controls whether the agent is in a learning mode or not, it should be possible to adjust this value during the running of the algorithm.
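The reward in (4) and the worked example in (5) can be checked with a two-line sketch (ours):

```python
def reward(tr, intf):
    """Equation (4): tr transmitting channels, intf of them interfered."""
    return tr - tr * intf - 1

print(reward(2, 1))   # -> -1, the small punishment computed in (5)
print(reward(2, 0))   # -> +1: two interference-free channels are rewarded
```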


There are numerous reasons why we would want to do this, but the most important is that the electromagnetic spectrum is an ever-changing environment and, as long as it is slowly changing, the agent can re-enter the learning phase when the environment has changed enough to make what it has learned redundant. This forces the agent to re-learn, so that it can effectively operate in the altered environment again. This can be an ongoing process, where the agent goes in and out of a learning phase whenever some metric that measures environmental change reaches some threshold. The function used in decreasing ε is the same as the one used in decreasing α, which is explained below.

The convergence of a discrete algorithm to an optimal policy is vital, and Watkins and Dayan have proved that Q-Learning does converge [8] as long as a number of conditions hold. The learning rate, α_t, where 0 ≤ α ≤ 1, decreases at each iteration. We looked at two ways of decreasing α_t, the first being to simply decrease it by a fixed value each time, as shown in equation (6):

α_t = α_t − (α_t ÷ EstNumI)   (6)

where EstNumI is the estimated number of iterations needed for the algorithm to converge. Another method for decreasing α, suggested by Watkins, was to decrease it based on the number of times a particular state-action pair, α(s, a), is visited:

α(s, a) = 1 / n(s, a)   (7)

where n(s, a) is the number of times that state-action pair has been visited. This alpha value will give 1, 1/2, 1/3, ... at each visit to a particular state-action pair.

Eventually the algorithm needs to converge on a particular state, which tells the agent the optimal number of channels it can transmit on without causing interference. This is achieved by allowing the policy to continue choosing actions until action 0, 'do nothing', is chosen for a fixed number of iterations, meaning that the algorithm has reached a stage where it will stay in the same state forever.

Finally, the greatest advantage of using this particular implementation is the small amount of memory and computational resources required. The state-action space would be considered substantially smaller than in many other problems that use Q-Learning. In a 4-channel network there is a maximum of 16 possible states, with 3 possible actions, making a total state-action space of 48. Assuming the use of floating point numbers, the memory space used in storing these Q values is only 192 bytes.

IV. RESULTS AND FINDINGS

A. Simulation

This work was based on a simple 4-node network, as shown in Figure 2, for simulation purposes.

Fig. 2. 4-Node Network.

We simulated an environment where there were 4 channels available for transmission, although the number of channels could be altered for each individual node to simulate a real-world environment where there may be other radios operating in the same environment. As we wanted to model this as accurately as we could, we used nodes that could only transmit on 1 channel, and in some cases 2, to ensure that the algorithm for each agent or node would converge on an optimal solution each time in different scenarios. Interference was represented in the simulated environment using a 2-dimensional vector; an example is shown in the table below.

         N0   N1   N3   N2
    N0    0    1    1    0
    N1    1    0    0    1
    N3    1    0    0    1
    N2    0    1    1    0

Table 1. Sample interference vector over all nodes, where 0 indicates that two nodes are not within interfering range and 1 that they are.
Each node, as mentioned above, has an interference vector which holds details of which nodes are within range that could cause interference upon acquiring a channel. The connecting lines in Figure 2 represent two nodes being within interfering range. For example, node 0 is within interfering range of nodes 1 and 3. For the purpose of these simulations, this table was used for determining interferable nodes within the network. In a real-world case, each node would need to determine which nodes are within range by itself. We implemented this in Java, as this was the first author's main language. Although this sufficed, future work will include implementing this on a number of SDRs, which will require a C implementation.

B. Results

The simulations carried out were mainly focused on different rates and different decreasing factors for ε, γ and α. We also looked at how the overall interference throughout the network reduced as the agents neared convergence.


Finally, we looked at using different forms of action selection strategies.

1) Experiments and final states: As discussed above, each independent node will eventually settle on a state which should maximise spectrum usage without causing any interference with other nodes. Simple 4-node experiments have shown that this is the case. A sample of the experiments is shown in Table 2 and has shown good results. In each case, the output shows that the algorithm has converged on an optimal solution that uses the maximum number of channels possible for each node without causing interference.

    Exp.   Node0   Node1   Node2   Node3   Output
    1      4       4       4       4       2,2,2,2
    2      4       1       4       4       2,1,2,2
    3      4       1       1       4       3,1,1,3
    4      4       4       1       1       2,2,1,1
    5      4       2       3       4       2,2,2,2
    6      4       1       4       1       2,1,2,1

Table 2. Output of results of channel usage.

The values shown in Table 2 for each experiment represent the number of channels available for transmission for each node. The output is the number of channels the algorithm converged to for each node, respectively. In each of these cases, the number of channels is the maximum possible without causing interference with other nodes. The channels have also been divided equally, without any communication or passing of information.

2) ε-reduction: The speed at which the algorithm converges on a solution depends a lot on how ε is reduced. The faster this value is reduced, the quicker the algorithm moves into the exploitation phase. Results have shown that the faster ε is reduced, the less chance the algorithm has to explore the state-action space in full, which usually results in it converging on a bad solution. Figure 3 shows sampled results of how the interference of a single node transitions during the course of learning the environment, sampled every 1000 iterations.

Fig. 3. Single cognitive node interference pattern over the course of learning the environment in a 4-node, 4-channel network.

It can be seen that for a large number of the iterations at the beginning of the algorithm, the node is interfering on all four available channels, but as it begins to transition into the exploitation phase, this number slowly declines and eventually the node does not interfere on any channel. It has been explained that one of the rates at which ε decreases is based on the function

ε_t = ε_t − (ε_t ÷ EstNumI)   (8)

In the case above, a value of 100,000 has been set for EstNumI, meaning that at each iteration ε will be decreased by ε_t ÷ EstNumI. If EstNumI is decreased substantially, thus causing a faster decrease in ε, the algorithm will act in a much more erratic manner and will fail to converge on a fair solution. Figure 4 illustrates an example where EstNumI is set to 15000. It shows how, in comparison to Figure 3, the algorithm fails to reduce the number of channels it is interfering on; although it converges much more quickly than with a higher EstNumI, each node will not acquire a fair number of channels, with the one below acquiring all 4 channels, causing interference on the network with some nodes, whilst causing other nodes to not acquire any.

Fig. 4. Single cognitive node interference pattern where the algorithm does not converge on a good solution.

Therefore the selection of EstNumI must be large enough to allow an agent to traverse the state-action space sufficiently before reaching an exploitative phase.
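Since the per-iteration update in (8) is geometric, the effect of EstNumI can be seen in closed form (a small sketch of ours, with the two settings discussed above):

```python
def epsilon_after(iterations, est_num_i, eps0=0.8):
    """Closed form of (8): eps is multiplied by (1 - 1/EstNumI) each step."""
    return eps0 * (1 - 1 / est_num_i) ** iterations

print(epsilon_after(100_000, 100_000))  # ~0.29: still exploring
print(epsilon_after(100_000, 15_000))   # ~0.001: exploitation far too early
```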
While incomparison to ɛ-Decreasing, which converged very quickly aslong as robust values were chosen for the different rates. ɛ-Decreasing allowed for greater exploration at the beginning,and greater exploitation towards convergence. This suited the71


4) γ-Selection: How much of the future rewards the algorithm takes into account during the exploration phase very much depends on the initial selection of the discount factor, γ. It has already been shown that Q-Learning will converge with a probability of 1, but what it converges to may not be good for what the algorithm hopes to achieve. It has been noted that the smaller γ is, the lower the probability that the algorithm will converge on a good solution; small values usually resulted in a number of the nodes acting in a greedy fashion while other nodes were not able to acquire any channels. A lower bound on the discount factor was discovered; if γ was set below this bound, these problems would occur. As long as γ is greater than the lower bound specified in (9), the algorithm will strive for a long-term goal as opposed to only focusing on current goals:

γ ≥ 0.5   (9)

It was also noted that any value over this threshold made very little difference in both the number of iterations needed to converge and the solution that the algorithm converged to. What is different between γ and the other variables is that it is fixed: it does not decrease or increase throughout the course of the algorithm. This threshold is based on the fact that γ is a discount factor for future rewards: as it approaches zero, the algorithm considers future rewards less and less important. This makes each individual agent work in a greedy fashion and only consider current rewards.

5) α-Selection (initial): The learning rate, α, as discussed earlier, determines how much the algorithm takes into account what it learns at each iteration over what it has previously learned. In comparison to other uses of Q-Learning, the importance of alpha here is quite low. It has been noted, through multiple experiments, that as long as there is a robust selection for ε and γ, it does not matter what initial value is selected for α, as long as it follows the basic criteria of being decreased appropriately through time (iterations) and lying in the range 0 ≤ α ≤ 1.

V. FUTURE WORK

This research is still in its early stages, and current work is focusing on implementing such an algorithm on a number of Software Defined Radios (SDRs) to develop a working example of how Q-Learning could be used in solving this problem. The Maynooth Adaptable Radio System (MARS) has been under development at NUI Maynooth's electronic engineering department since 2004 [9].

Current software demonstrations allow for the transmission of images using the IRiS software architecture developed at the CTVR, Trinity College, Dublin, and of a large number of waveforms using a MARS demonstration application, from a transmitter to a receiver. Implementing the Q-Learning algorithm on the MARS boards will give us a working example of a machine learning algorithm applied to this problem in a real-world environment, although there are a number of challenges to overcome first.

Fig. 5. MARS Receiver and Transmitter Boards.

Currently the transmitter and receiver boards run off separate machines, and since cognitive radios need to be fully duplex for both scanning and receiving, there is a need to have both boards running on the same machine. Current work will focus on determining whether the kernel will be able to recognise both boards running concurrently.

There are also a number of improvements to be made to the current algorithm.
V. FUTURE WORK

This research is still in the early stages, and current work is focusing on implementing such an algorithm on a number of Software Defined Radios (SDRs) to develop a working example of how Q-Learning could be used in solving this problem. The Maynooth Adaptable Radio System (MARS) has been under development at NUI Maynooth's electronic engineering department since 2004 [9].

Current software demonstrations allow for the transmission of images, using the IRiS software architecture developed at the CTVR, Trinity College, Dublin, and of a large number of waveforms, using a MARS demonstration application, from a transmitter to a receiver. Implementing the Q-Learning algorithm on the MARS boards will give us a working example of a machine learning algorithm applied to this problem in a real world environment, although there are a number of challenges to overcome first. Currently the transmitter and receiver boards run off separate machines, and since cognitive radios need to be fully duplex for both scanning and receiving, there is a need to have both boards running on the same machine. Current work will focus on determining if the kernel will be able to recognise both boards running concurrently.

Fig. 5: MARS Receiver and Transmitter Boards

There are also a number of improvements to be made to the current algorithm. For example, individual channels can be included so that agents will be able to differentiate between good channels and bad channels, as opposed to only whether a given number of channels to use is good or not. This would vastly improve the performance of the algorithm in terms of avoiding interference.

Although this is a completely independent learning algorithm with no co-operation between nodes, it may be worth exploring what advantages some limited communication between nodes may have. One such example could be passing information about bad channels between nodes based on some threshold of the Q-Values. It has been mentioned that there is no communication between nodes, but in the real-time implementation there may need to be a small, low-bandwidth control channel for communicating to other nodes which channel a node is going to transmit on.

It may also be worth exploring how well a centralised approach would work using the Q-Learning algorithm, by using a master-slave setup in a network. This would involve one fat node doing much of the computation and using a control channel to transmit channel usage information back and forth.

VI. CONCLUSION

In this paper, we have presented a simple RL technique for channel assignment in a network of independent cognitive nodes. This is achieved using a self-learning scheme based on a TD learning algorithm known as Q-Learning, using a 2-dimensional state in an unknown environment. Q-Learning's suitability to this problem has been shown, as it can take in unknown situations and act upon them using its own experiences.


Simulations carried out on a 4-node network with 4 channels have shown good results in fair, non-greedy channel assignment, so much so that we will pursue implementing this on a number of Software Defined Radios in a real-time environment, with the future goal of using this as a benchmark against which to measure other machine learning algorithms' abilities to perform this task. The most significant advantage of this implementation is how small the memory, bandwidth and computational requirements are in comparison to many other cognitive radio channel assignment schemes.

VII. ACKNOWLEDGEMENTS

This work has been carried out with the support of Science Foundation Ireland (SFI) through the Centre for Telecommunications Value Chain Research (CTVR), the Institute of Microelectronics and Wireless Systems at the National University of Ireland, Maynooth and Irene Macaluso of the CTVR, Trinity College, Dublin.

REFERENCES

[1] J. Mitola and G. Q. Maguire, "Cognitive Radio: Making Software Radios More Personal," IEEE Personal Communications, 1999.
[2] FCC Spectrum Policy Task Force Report, ET Docket No. 02-135, 2002.
[3] H. Li, "Multi-Agent Q-Learning of Channel Selection in Multi-User Cognitive Radio Systems: A Two by Two Case."
[4] S. M. Mishra, A. Sahai and R. W. Brodersen, "Cooperative Sensing among Cognitive Radios," in Proc. of the IEEE International Conference on Communications (ICC), pp. 1658-1663, 2006.
[5] N. Lilith and K. Dogancy, "Dynamic Channel Allocation for Mobile Cellular Traffic using Reduced-State Reinforcement Learning," WCNC 2004, pp. 2195-2200.
[6] L. Kaelbling, M. L. Littman and A. W. Moore, "Reinforcement Learning: A Survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237-285, 1996.
[7] C. J. Watkins, "Learning from Delayed Rewards," Ph.D. Thesis, Cambridge, 1989.
[8] C. J. Watkins, "Q-Learning," Machine Learning, vol. 8, pp. 279-292, 1992.
[9] R. Farrell, "Software-Defined Radio Demonstrators: An Example and Future Trend," Centre for Telecommunications Value Chain Research, Institute of Microelectronics and Wireless Systems, National University of Ireland Maynooth, Maynooth, Co. Kildare, Ireland.


Section 2B

GEOCOMPUTATION


Extracting Localised Mobile Activity Patterns from Cumulative Mobile Spectrum RSSI

John Doyle*, Ronan Farrell, Seán McLoone, Tim McCarthy, Peter Hung
Institute of Microelectronics and Wireless Systems,
National University of Ireland Maynooth,
Maynooth, Co. Kildare, Ireland
Email: *jcdoyle@eeng.nuim.ie

Abstract—Techniques for observing the flow of people are creating new means for observing the dynamics between people and the environments they pass through. This ubiquitous connectivity can be observed and interpreted in real time through mobile device activity patterns. Recent research into urban analysis through the use of mobile device usage statistics has presented a need for the collection of this data independently of mobile network operators. In this paper we demonstrate that by extracting cumulative received signal strength indication (RSSI) for overall mobile device transmissions, such information can be obtained independently of network operators. We present preliminary results and suggest future applications for which this collection method may be used.

Index Terms—RSSI, Erlang, human monitoring, geo-temporal weighting.

I. Introduction

Mapping applications which present the flow of human activities are now becoming increasingly common; one of the main contributors to this is the vast amount of information made available from mobile devices. In 2007 the number of mobile phones in Ireland numbered 5.3 million [1] while the human population numbered 4.3 million [2]. It is quickly becoming the norm in the developed world that mobile phone devices outnumber people. The developing world too has seen a rapid surge in mobile device numbers, as mobile networks are often easier and cheaper to install than landline networks.

As a result of this ever-expanding technology, activities that once required a fixed location and connection can now be achieved with higher flexibility, which enables users to act and communicate more freely. The usage patterns obtained from mobile device activity can enable us to model the dynamics of human flow in modern environments [3].

The ability to detect such activity has become increasingly important due to growing interest in the provision of location based services (LBS). LBS researchers have developed techniques for the detection of people in the proximity of an area other than through examining mobile usage statistics. One common approach is to use vision based techniques which utilise camera surveillance systems to identify crowd numbers and behaviour [4], [5], [6]. However, these types of systems raise certain social issues with regard to privacy [7], [8].

As stated in Doyle et al. [9], the mobile phone usage statistic commonly employed in mobile usage mapping applications is a measure of network bandwidth used. Typically, this is collected at a base station within a mobile operator's network, or by use of special software installed on mobile phones. The metric by which this activity is measured is known as an Erlang. An Erlang is one person-hour of phone use, which could represent one person talking for an hour, two people talking for half an hour each, 30 people each speaking for two minutes, and so on [10]. A more modern interpretation of this metric would be to consider the quantity of digital data transferred, regardless of the form of communication, such as voice, SMS and data. This method was valuable in the past due to the restricted nature of mobile telecommunications, which were fundamentally voice-only networks.
Modern networks have a progressively diverse range of usages which do not correspond linearly to the intensity of communication. For instance, text messaging uses very little bandwidth though it is an important form of communication.

As an alternative to collecting data throughput measurements, we have adopted a technique for monitoring the cumulative electromagnetic energy in the frequency band of client-side mobile phone transmissions (i.e. the mobile device to base station transmission band). By analysing these RSSI values over time and space through a collaborative network of sensors, we propose that results can be obtained that are of comparable quality to the more invasive network bandwidth metrics (Erlang). Such measurements can be easily achieved using well known circuitry for Received Signal Strength Indication (RSSI) [11], [12]. The information gathered is inherently anonymous due to the absence of information decoding. As a result, it is impossible to deduce individual identities or phone information content from the raw data collected and stored in the proposed system, thus avoiding the potential ethical issues faced by both vision based and network operator polled systems.

In the rest of this paper, we highlight the use of an energy detecting device to monitor mobile spectrum activity for the purpose of mapping mobile device activity. Section II gives an overview of some related work in this field.


Section III describes the proposal put forward by this paper. Section IV details the experimental setup adopted to measure the temporal RSSI data, from which useful information is extracted. Section V presents the results of experiments carried out focusing on the collection of RSSI mobile device data under different scenarios. Section VI summarises the conclusions of the work to date and outlines future directions for research.

II. Background

This section presents an overview of some work related to the collection and analysis of human movement data. This can be grouped into real time urban flow mapping, location tracking and spectrum strength collection.

A. Real Time Urban Flow Mapping

The emergence of new mapping applications which present the flux of people in an attempt to demonstrate the dynamics of metropolitan cities highlights the recent growth in interest relating to tracking human flow on urban scales. Over the last few years this research area has seen steady growth, with large projects starting in European and Asian cities. The monitoring of mobile phone usage patterns has been the major data source used to extract the human behavioural patterns needed for these applications. Other sources such as passive tolling of Bluetooth devices, as well as techniques including GPS tracking and short range tracking, have been utilised in the past, but these do not scale easily in urban environments.

Amsterdam Real Time [13] and the Cityware Research Group [14] are examples of such projects. The Amsterdam Real Time project aimed to construct a dynamic map of Amsterdam, Netherlands, based on trace lines produced from the collection of GPS data relating to people's movements. Each person volunteered in the experiment and was equipped with a GPS receiver. This receiver fed the GPS coordinates of the volunteer to a central system in real time. Maps produced were solely based on this GPS data. In the UK, the Cityware research group supplemented the pedestrian flow data typically gathered as part of a space syntax analysis with data on Bluetooth devices passing through pedestrian survey gates.

To date there are two main methods for gathering mobile usage information: data collection at the operator level, and through modified mobile phone software. The first requires the cooperation of mobile operators to provide data on a macro level of urban areas. Graz in Real Time [15], the Mobile Landscapes project [3], Real Time Rome [16] and the Bangkok Metropolitan Project [17] are examples of projects which utilised this network operator data.

The Graz in Real Time project is a real time mobile phone monitoring system based on cell phone traffic intensity, traffic migration (handovers) and traces of registered users as they moved through the city of Graz.

The Mobile Landscapes project collected network usage data in Milan, Italy. When combined with the geographical mapping of cell areas, a graphical representation of the intensity of urban activities and their evolution through space and time was produced. From this they were able to detect events such as national holidays and major sporting events.

Real Time Rome was MIT's SENSEable City Laboratory contribution to the 10th International Architecture Exhibition in Venice, Italy. The project was the first example of an urban-wide real time monitoring system that collects and processes data provided by telecommunications networks and transportation systems.
It used location data from mobile phone subscribers provided by Telecom Italia, public buses run by the local transport company Atac, and taxis run by the cooperative Samarcanda.

Horanont and Shibasaki [17] presented an implementation of mobile sensing for large-scale urban monitoring in Bangkok Metropolitan, Thailand. They used Erlang data from Advanced Info Service PLC (AIS), a leading mobile operator in Thailand. They showed that large scale monitoring of clusters of Erlang data from mobile base stations was able to provide indirect interpretations of spatial patterns of urban life and its temporal dynamics.

However, there are difficulties with this approach, most notably the legal and privacy issues that prevent operators delivering such information to outside researchers. In addition, even with best efforts, there is no guarantee that data from these sources is always available, complete or accurate. Network operators continually optimise their network throughout the day, using temporary towers. This adds a level of uncertainty to these fixed point measurements as network topologies become more dynamic. A more fundamental issue arises regarding spatial accuracy, as the spatial resolution of the usage statistics is dependent on both the operator's network topology and base station hardware.

As a result, approaches have emerged which aim to address these issues by placing embedded software applications on the mobile devices to log data. The Estonia group project [18] and MIT's Reality Mining project [19] are examples of projects which utilise this approach.

Ahas and Mark [18] tracked the mobile phones of 300 users for a social positioning application. They combined spatio-temporal data from phones with demographic and attitudinal data from surveys, creating a map of social spaces in Estonia.

MIT's Reality Mining project illustrated that it was possible to extract common behavioural patterns from the activities of 94 subjects. The subjects were issued with mobile phones pre-installed with several pieces of software that record and send research data on call logs, Bluetooth devices in proximity, cell tower IDs, application usage, and phone status. This yields valuable, person-specific results, but the solution may not be easy to scale considering the large numbers needed to represent urban and suburban populations.

B. Mobile Phone Location Tracking

Most indoor environment based localisation research to date has focused on the accurate localisation of objects and people using short-range signals, such as WiFi [20], [21], [22], Bluetooth [23], ultrasound [24], and infra-red [25].


Outdoor localisation is almost exclusively performed using the Global Positioning System (GPS).

Otsason et al. [26] showed that an indoor localisation system based on wide-area GSM fingerprints can achieve high accuracy, and is in fact comparable to an 802.11-based implementation. To date there are two major ways for mobile phone locations to be tracked in mobile networks, namely network-centric and device-centric localisation. In network-centric systems, base stations make the measurements of distance to a mobile phone and send the results to a centralised location, at which the location of the mobile device is calculated. In device-centric systems, the handset performs the calculation itself on the basis of environmental information gathered from the network. Hybrid solutions are also possible, which try to combine the advantages of both.

The American National Standards Institute (ANSI) and the European Telecommunications Standards Institute (ETSI) stated that mobile positioning systems can be classified under the following technologies: cell identification, angle of arrival, time of arrival, enhanced observed time difference, and assisted GPS [3].

• Cell identification: The available coordinates of the serving base station are associated with the mobile device. The accuracy of the locational information depends upon the physical topology of the network.
• Angle of arrival (AoA): The AoA method uses data from base stations that have been augmented using arrays of smart antennas. This allows the base station to determine the angle of incoming radio signals, making it possible to then determine the location of a handset by triangulating known signal angles from at least two base stations.
• Time of arrival (ToA): Position here is determined by triangulating the time needed for a packet to be sent from a phone to three finely synchronised base stations and back (a least-squares sketch of this idea follows this list).
• Enhanced observed time difference (E-OTD): This requires handsets to be equipped with software that locally computes location. Three or more synchronised base stations transmit signal times to the mobile device, the embedded software of which calculates time differences and therefore distance from each base station, making triangulation possible.
• Assisted global positioning system (A-GPS): Here devices use both GPS and a terrestrial cellular network to obtain geographic positioning.
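As a concrete illustration of the ToA approach, the following minimal sketch converts one-way arrival times into ranges and recovers a 2-D position by linearised least squares; the station coordinates and timings are made-up demonstration values, not drawn from any system described here.

```python
import numpy as np

# Illustrative time-of-arrival position fix: one-way times to three
# synchronised base stations are converted into ranges, and the handset
# position is recovered by linearised least squares. All coordinates
# and timings below are made-up demonstration values.

C = 3.0e8  # propagation speed (m/s)

def toa_fix(anchors, times):
    """anchors: (n, 2) base station coordinates; times: one-way delays (s)."""
    anchors = np.asarray(anchors, dtype=float)
    d = C * np.asarray(times, dtype=float)  # ranges in metres
    x1, y1 = anchors[0]
    a_rows, b_rows = [], []
    # Subtracting the first range equation from each other one yields a
    # linear system in (x, y): 2(xi-x1)x + 2(yi-y1)y = d1^2 - di^2 + ...
    for (xi, yi), di in zip(anchors[1:], d[1:]):
        a_rows.append([2.0 * (xi - x1), 2.0 * (yi - y1)])
        b_rows.append(d[0]**2 - di**2 + xi**2 - x1**2 + yi**2 - y1**2)
    pos, *_ = np.linalg.lstsq(np.array(a_rows), np.array(b_rows), rcond=None)
    return pos  # estimated (x, y)

# Phone at roughly (40, 30) m relative to three base stations.
stations = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
true_pos = np.array([40.0, 30.0])
times = [np.linalg.norm(true_pos - s) / C for s in stations]
print(toa_fix(stations, times))  # ~[40. 30.]
```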
C. Spectrum Signal Strength Collection

To collect the cumulative electromagnetic energy in the frequency range of client-side mobile phone transmissions, one must be able to measure and quantify the energy in the specific band occupied by those transmissions. This is effectively measuring the signal strength in a specific frequency band [11], a common technique in wireless communications. To do this reliably, an energy detecting device is used which returns a received signal strength indication (RSSI) parameter. Energy detecting devices can easily be purchased or built. Due to this ready availability, RSSI has been considered in the past as a sensing parameter. A number of applications have provided insight into its usefulness; both Wu et al. [27] and Stoyanova et al. [28], in particular, describe the key issues which affect RSSI accuracy. They are summarised as:

• the orientation of the antenna;
• transceiver variation;
• multipath fading and changes in environment.

Multipath fading and environment changes contribute the main variance in RSSI data. This relates to part of the electromagnetic energy radiated by the antenna of a transmitter reaching a receiver by propagating through different paths. Along these paths, interactions known as propagation mechanisms may occur between the electromagnetic field and various objects. To model these mechanisms, propagation prediction models have been devised to provide an accurate estimate of the mean received power or path loss (PL) for a specified frequency band, based on geographical information about the environment. Empirical, semi-deterministic, and deterministic models are the main classes which describe mobile channel characteristics [29]. As these propagation models describe how a signal may act in a given environment, they must be used when trying to gain insight into the positions of signal sources.
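As a concrete example of the empirical class, a log-distance path-loss model is sketched below; the reference path loss, reference distance and path-loss exponent are illustrative assumptions, not values from this paper.

```python
import math

# Log-distance path loss, one of the simplest empirical propagation
# models: PL(d) = PL(d0) + 10 n log10(d / d0), with n an environment
# dependent exponent (~2 in free space, 3-5 indoors or in dense urban
# areas). The reference values below are illustrative assumptions.

def path_loss_db(d, d0=1.0, pl_d0=40.0, n=3.0):
    """Mean path loss in dB at distance d (metres)."""
    return pl_d0 + 10.0 * n * math.log10(d / d0)

def mean_received_power_dbm(tx_dbm, d, **model):
    """Mean received power given a transmit power and the model above."""
    return tx_dbm - path_loss_db(d, **model)

# A 33 dBm (2 W) handset seen from 50 m under these assumptions:
print(mean_received_power_dbm(33.0, 50.0))  # about -58 dBm
```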


In recent years cognitive radio systems [30], [31], [32] have become increasingly viable, and signal strength measurement is a key element in the detection of primary user spectral occupancy. To improve performance, researchers have explored a number of techniques that can be used to address these issues, such as collaborative sensing between multiple RSSI detectors [33], [34]. By cross-correlation and signal processing, non-random signals can be detected and analysed. Similar approaches can be applied to existing transmissions to detect usage and extract statistical patterns.

III. Proposal

Our proposal is based on the measurement of the localised cumulative strength of mobile device emissions through the use of an RSSI sensor. We propose that this data can provide a suitable alternative to operator obtained data. Results will demonstrate that the proposed method can capture mobile phone activity and display the spatio-temporal patterns contained within.

As an alternative sensing parameter, cumulative received signal strength (RSSI) offers several advantages over network usage data:

• RSSI data can be collected without the cooperation of mobile operators or mobile device users.
• RSSI as a metric is independent of modulation type, so RSSI can be used for both GSM and 3G protocols.
• Geo-spatial RSSI data can provide fine resolution, making it possible to localise events very accurately and quickly.
• RSSI collection hardware can easily be modified to observe different metrics, making a network deployment very flexible.

However, individual wideband signal strength measurements from a single sensor have limitations in terms of localised accuracy. This is due to limiting channel characteristics and the inability to distinguish between a single near device transmitting with high power and several users far away transmitting with low power. The question then is how to reliably collect this information taking such factors into account.

By adopting techniques commonly utilised in cognitive radio systems, we propose that these accuracy issues may be mitigated. First, by spatially and temporally weighting each RSSI data point from a sensor with corresponding points from other radios in the geographical area nearby, the RSSI accuracy can be improved [33], [34]. Second, modelling the environment with accurate models will help quantify the data and give insight into its behaviour. Third, calibration with respect to base station coverage will reduce effects caused by mobile device transmission power variation. Finally, the spatial sampling topology of the sensor network will be a dominant factor in determining performance, particularly when variable sensor heights are also considered. Thus methods for ensuring topology uniformity must be taken into account.

To distinguish between the RSSI signal generated by one user near the sensor and several users further away, we will deploy a dense network topology. This will ensure that spectral energy readings from each sensor can be localised to some degree. To localise such activity there are several possible solutions. One is to localise activity based on a sensor identification technique, similar to the cell identification used to identify a mobile device's position in a cellular network. Here the sensor node with the highest associated RSSI value is deemed to be the coordinate of the activity. This will however offer reduced spatial resolution. Thus a more advanced technique, which combines multi-sensor information, would be a more suitable approach.

IV. Experimental Work

A. Experimental Setup

Our experiments were based on the measurement of the localised cumulative strength of mobile device emissions through the use of a custom-made RSSI sensor. The main component used to measure the RSSI intensity was a true power detector from Analog Devices (chip part number AD8362) paired with a single omni-directional GSM 900 antenna. The AD8362 device returns a voltage which corresponds linearly to the RF spectrum power passed through it. It operates with a 65 dB dynamic range, from -55 dBm to 10 dBm. To obtain a measure of the performance, experiments were carried out within a building on NUI Maynooth's North Campus. The measured performance of two such sensors was compared to that of a spectrum analyser, the results of which can be found in Section V.
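For illustration, the sketch below converts the detector's output voltage to a power reading, assuming a nominal linear-in-dB slope of 50 mV/dB and a calibrated intercept; both figures are assumptions standing in for per-device calibration, not measured values from this work.

```python
# The AD8362 is a linear-in-dB detector: its output voltage is
# proportional to the input power expressed in dBm. The conversion
# below assumes a nominal 50 mV/dB slope and an intercept obtained by
# calibration; both figures are stand-in assumptions.

SLOPE_V_PER_DB = 0.050   # assumed detector slope (V/dB)
INTERCEPT_DBM = -60.0    # assumed input power giving 0 V output

def rssi_dbm(v_out):
    """Convert detector output voltage (V) to input power (dBm)."""
    return INTERCEPT_DBM + v_out / SLOPE_V_PER_DB

def clip_to_range(p_dbm, lo=-55.0, hi=10.0):
    """Restrict readings to the sensor's 65 dB dynamic range."""
    return max(lo, min(hi, p_dbm))

print(clip_to_range(rssi_dbm(1.25)))  # 1.25 V -> -35.0 dBm here
```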
Doyle et al. [9] described the capabilities of such a sensor with respect to picking up different types of phone activity. That paper highlighted the capability of such sensors for picking up even short bursts of mobile transmission energy, with both text message and phone call activity clearly identified. A technique for the extraction of areas of temporally dense activity was also demonstrated. From this information, areas of high temporal density around each hour mark were highlighted; these times coincided with the starting and finishing times of lectures, thus demonstrating that RSSI can provide the information needed to monitor human behaviour.

To further validate the capabilities of the sensing devices and feature extraction methodology, we designed two experiments which tested different scenarios of mobile phone activity. The focus was to test our method for geo-spatial temporal weighted signal processing. Both experiments took place in the foyer of the Engineering building at NUI Maynooth under controlled conditions (no other phone activity). The results can be seen in Section V.

• Experiment 1: Obtain RSSI measurements from a phone call while a person is walking in a uniform direction. The path taken is depicted in Fig. 1a.
• Experiment 2: Measure readings from a phone call while a person is walking in a non-uniform direction. The path taken is depicted in Fig. 1b.

Fig. 1: Layout of sensors and path walked by a phone user for a controlled test carried out in the Engineering foyer on NUI Maynooth's North Campus. A and B indicate the positions of sensors A and B respectively. (a) Path taken in Experiment 1; (b) Path taken in Experiment 2.

B. Processing Method

Various signal processing algorithms can be applied to assist with extracting interesting patterns from measured mobile phone signal strengths. Our approach has focused on a geo-spatial temporal based scheme that identifies time periods with interesting behaviour. One early implementation is explained in this section; its layout is depicted in Fig. 2.

The spectral energy was sampled at a rate of 2 kHz and is denoted s(k). The signal processing method applied to these samples consists of four stages; a sketch implementing all four stages in code follows the list.

• Stage 1: Detect the presence of a mobile transmission as governed by a cut-off threshold τ:

s_\tau(k) = \begin{cases} 0, & \text{if } s(k) < \tau \\ s(k), & \text{if } s(k) \geq \tau \end{cases} \qquad (1)


where τ in this instance is chosen to be -55 dBm, the minimum detectable level of the energy detecting chipset.

Fig. 2: Signal processing performed on raw RSSI data, taking s(k) through s_τ(k), s_b(k) and s_f(k) to the geo-spatial temporal weighting s_G(i, j, k). Feeding the output back into the geo-spatial temporal weighting stage gives an n-th order weighting.

• Stage 2: Downsample the data by a factor of T; this is done by replacing every block of T samples by its average:

s_b(i) = \frac{1}{T} \sum_{k=(i-1)T+1}^{iT} s_\tau(k) \qquad (2)

where s_b(i) is the downsampled data set and T is the downsampling factor. Decimation should be application specific: while it can trim down the noise within the data, excessive decimation may reduce the signal of short temporal events, such as text messages.

• Stage 3: Smooth the data using a moving average filter (MAF) of width (2W + 1) samples:

s_f(i) = \frac{1}{2W + 1} \sum_{p=i-W}^{i+W} s_b(p) \qquad (3)

where s_f(i) is the resulting filtered data set.

• Stage 4: Given a vector of readings from a set of n sensors,

\mathbf{s}(k) = [s_1(k), s_2(k), s_3(k), \ldots, s_n(k)] \qquad (4)

apply a geo-spatial temporal weighting using a truncated Gaussian kernel. Here s_i(k), the sensor reading from the i-th sensor, has an associated coordinate in space (x_i, y_i) relating to the position of the sensor. To achieve this weighting, points are calculated in space-time by a collaborative weighting of readings taken from each sensor node. A point in space-time s_G(x, y, k) can be calculated using

s_G(x, y, k) = \sum_{p=k-j}^{k+j} \sum_{i=1}^{n} g_{ip}(x, y, k) \, s_i(p) \qquad (5)

where g_{ip}(x, y, k) is the geo-spatial temporal weight corresponding to reading s_i(p) and 2j + 1 is the width of the truncating window in time. The weight g_{ip}(x, y, k) is given by

g_{ip}(x, y, k) = g_\beta(x, x_i) \, g_\beta(y, y_i) \, g_\beta(k, p) \qquad (6)

where

g_\beta(u, v) = e^{-\left( \frac{u - v}{\sigma_u \beta} \right)^2}. \qquad (7)

Here, u and v are placeholders for the corresponding variables in Eq. 6, σ_u denotes the initial spreading factor assigned to each dimension, and β is a scaling factor controlling the spread given to those points whose weight is over the lower limiting threshold γ, such that

\beta = 1 \text{ if } s_i(k) < \gamma, \qquad \beta > 1 \text{ if } s_i(k) \geq \gamma \qquad (8)

The effect of this stage is to weight each RSSI data point from a sensor with corresponding spatial and temporal points from other sources, such that readings that are both spatially and temporally close are amplified.
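The following is a minimal sketch of the four stages in Python. The threshold, window sizes and spreading factors are illustrative values, and the β switching rule of Eq. (8) is simplified to a fixed β; this is a sketch under those assumptions, not the authors' implementation.

```python
import numpy as np

# A sketch of the four stages above applied to raw samples from n
# sensors. The threshold, window sizes and spreading factors are
# illustrative, and the beta switching of Eq. (8) is simplified to a
# fixed beta.

TAU = -55.0              # dBm, cut-off threshold of Stage 1
T = 100                  # downsampling factor of Stage 2
W = 5                    # MAF half-width of Stage 3
SIGMA = (5.0, 5.0, 3.0)  # spreading factors for x, y and time
BETA = 1.0               # fixed spread scaling (simplifying assumption)

def threshold(s):
    """Stage 1, Eq. (1): zero out samples below the detection threshold."""
    return np.where(s < TAU, 0.0, s)

def downsample(s):
    """Stage 2, Eq. (2): replace every block of T samples by its average."""
    n = len(s) // T
    return s[:n * T].reshape(n, T).mean(axis=1)

def smooth(s):
    """Stage 3, Eq. (3): moving average filter of width 2W + 1."""
    kernel = np.ones(2 * W + 1) / (2 * W + 1)
    return np.convolve(s, kernel, mode="same")

def g_beta(u, v, sigma):
    """Gaussian kernel of Eq. (7)."""
    return np.exp(-(((u - v) / (sigma * BETA)) ** 2))

def weight_point(x, y, k, series, coords, j=2):
    """Stage 4, Eq. (5): weighted sum over sensors and a 2j+1 time window."""
    total = 0.0
    for i, (xi, yi) in enumerate(coords):
        g_xy = g_beta(x, xi, SIGMA[0]) * g_beta(y, yi, SIGMA[1])
        for p in range(max(0, k - j), min(len(series[i]), k + j + 1)):
            total += g_xy * g_beta(k, p, SIGMA[2]) * series[i][p]
    return total

# Usage on fake data from two sensors: threshold, downsample and smooth
# each stream, then evaluate s_G at a grid point (x, y) and time index k.
raw = [np.random.uniform(-70.0, -20.0, 5000) for _ in range(2)]
series = [smooth(downsample(threshold(s))) for s in raw]
coords = [(0.0, 0.0), (10.0, 0.0)]
print(weight_point(5.0, 0.0, 10, series, coords))
```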
V. Results

The results shown here reflect measurements of wideband mobile phone RSSI taken on NUI Maynooth's North Campus. Fig. 4 illustrates the sensitivity comparison between a spectrum analyser and the RSSI sensors, whose architecture is described in Section IV-A. It can be seen that the readings from the RSSI sensors, though less precise, resemble those from a spectrum analyser.

Fig. 3 presents the measurements collected in an experiment prior to geo-spatial temporal weighting. The experiments were carried out to verify the ability of the signal processing algorithm to highlight the movement of mobile devices in an indoor environment. Fig. 5 and Fig. 6 show how the geo-spatially temporally weighted points in space may be visualised in the form of contour maps that highlight the device activity picked up. The temporal shift of energy can clearly be observed as the positions of the phone calls, in voice communication mode, vary in space. Currently, a preliminary method is employed to interpolate the data over space. This consisted of adopting the sensor nodes' positions as the centres of energy, annealing the signal as we moved further out. Note that the weights represented in each contour plot are relative measures compared to those of surrounding areas. As a result, the measure of dominance should be considered relative and not as an absolute value.

Future work will involve more advanced methods which may take into account pre-defined information gathered from geographical information systems (GIS) and channel models relating to the mobile spectrum band of interest.


Nevertheless, these early results suggest that localised cumulative RSSI data could be a valuable source of information when trying to extract flow information from mobile devices.

Fig. 3: RSSI measurements obtained in the Engineering foyer of NUI Maynooth for both experiments 1 and 2: (a) RSSI measurements obtained from sensor A; (b) RSSI measurements obtained from sensor B.

Fig. 4: RSSI measurements obtained in the foyer of NUI Maynooth's Electronic Engineering building showing the relationship between the sensor nodes used and a spectrum analyser: (a) locations of calls made in the foyer, the positions of the sensing nodes and spectrum analyser; (b) readings taken from a spectrum analyser; (c) readings taken from sensor A; (d) readings taken from sensor B.

VI. Conclusions

This paper summarised the work being carried out in the area of mapping mobile phone activity on urban and localised scales. An overview was also presented of popular localised tracking techniques and of issues which relate to the reliable measurement of mobile spectrum RSSI. Experiments demonstrate that the detection of mobile spectrum RSSI can provide useful information when monitoring mobile device activity in a localised context. This information is gathered without the cooperation of mobile network operators or users and retains usage anonymity due to the lack of information decoding. We presented a preliminary technique for the detection and visualisation of mobile activity flow within indoor environments.

This proposed approach could also be used to complement traditional techniques for mapping mobile device activity. For instance, one could use network operator data, if available, to model the dynamics of a city or town, while localised RSSI data within such an urban environment is used to observe the dynamics of specific buildings or localised areas. Nonetheless, our research is still in its preliminary stages, so additional validation is needed.

For this purpose, a mobile sensor network aimed at the collection of RSSI data is under construction. It will first be distributed throughout the North Campus of NUI Maynooth, with a view to expanding it into the nearby South Campus and the town of Maynooth in the longer term. This project will offer an opportunity to understand some of the dynamics relating to university student life. Moreover, focusing on temporal and spatial patterns of mobile phone activity may shed light on how we interact with our local environment.

We hope to address such questions as how buildings are really used on campus, how to determine where people can be found as opposed to where they pass through, and how to identify interesting localised events as they occur in time and space. The answers to these questions would pave the way for a number of interesting applications.


A real time map of human flow could be produced, showing the real time movements of the student population, both indoor and outdoor. The map could provide insights to university planning authorities when deciding on the location of student services, or of emergency services where rapid response is required.

Fig. 5: Mapping of RSSI information obtained after the geo-spatial temporal weighting process for the time slot of experiment 1 in Fig. 3 at different sampling times: (b) sampled at 12.5 s; (c) sampled at 18.75 s; (d) sampled at 25 s; (e) sampled at 31.25 s.

Fig. 6: Mapping of RSSI information obtained after the geo-spatial temporal weighting process for the time slot of experiment 2 in Fig. 3 at different sampling times: (b) sampled at 50 s; (c) sampled at 62.5 s; (d) sampled at 75 s; (e) sampled at 87.5 s; (f) sampled at 100 s. The path numbers indicate the walking sequence of the phone user.

Acknowledgements

Research presented in this paper was funded by a Strategic Research Cluster grant (07/SRC/I1168) awarded by Science Foundation Ireland under the National Development Plan, and by the Irish Research Council for Science, Engineering and Technology under their Embark Initiative in partnership with ESRI Ireland. The authors gratefully acknowledge this support.

References

[1] I. C. R. (COMREG), "Quarterly report," Tech. Rep., March 2009.
[2] I. C. S. O. (CSO), "Population and migration estimates," Tech. Rep., April 2008.
[3] C. Ratti, S. Williams, D. Frenchman, and R. Pulselli, "Mobile landscapes: using location data from cell phones for urban analysis," Environment and Planning B: Planning and Design, vol. 33, no. 5, pp. 727–748, 2006.
[4] Y. Ivanov, C. Stauffer, A. Bobick, and W. Grimson, "Video surveillance of interactions," IEEE Workshop on Visual Surveillance, vol. 19, no. 12, pp. 82–89, 1999.
[5] D. Ayers and M. Shah, "Monitoring human behaviour from video taken in an office environment," Image and Vision Computing, vol. 19, no. 12, pp. 833–846, 2001.
[6] A. Davies, J. Yin, and S. Velastin, "Crowd monitoring using image processing," Electronics And Communication Engineering Journal, vol. 7, no. 1, pp. 37–47, 1995.
[7] K. Bowyer, "Face recognition technology: security versus privacy," IEEE Technology and Society Magazine, vol. 23, no. 1, pp. 9–19, 2004.
[8] L. Barkuus and A. Dey, "Location-based services for mobile telephony: a study of users' privacy concerns," Proceedings of INTERACT 2003, 9th IFIP TC13 International Conference on Human-Computer Interaction, pp. 9–19, July 2003.
[9] J. Doyle, R. Farrell, S. McLoone, T. McCarthy, P. Hung, and M. Tahir, "Utilising Mobile Phone RSSI Metric for Human Activity Detection," Proceedings of ISSC 2009, 20th Irish Signals and Systems Conference, June 2009.


[10] J. Reades, F. Calabrese, A. Sevtsuk, and C. Ratti, "Cellular census: Explorations in urban data collection," IEEE Pervasive Computing, vol. 6, no. 3, pp. 30–38, 2007.
[11] H. Urkowitz, "Energy detection of unknown deterministic signals," Proceedings of the IEEE, vol. 55, no. 4, pp. 523–531, April 1967.
[12] K. Srinivasan and P. Levis, "RSSI is under appreciated," in Proceedings of the Third Workshop on Embedded Networked Sensors (EmNets 2006), 2006.
[13] E. Polak, "Amsterdam real time," Waag Society, http://www.waag.org/realtime/, 2002.
[14] E. O'Neill, V. Kostakos, T. Kindberg, A. Penn, and S. F. T. Jones, "Instrumenting the city: Developing methods for observing and understanding the digital cityscape," Lecture Notes in Computer Science, vol. 4206, pp. 315–332, Sept 2006.
[15] C. Ratti, A. Sevtsuk, S. Huang, and R. Pailer, "Mobile landscapes: Graz in real time," in Proceedings of the 3rd Symposium on LBS And TeleCartography. Springer, 2005, pp. 28–30.
[16] F. Calabrese and C. Ratti, "Real time Rome," Networks and Communication Studies, vol. 20, no. 3-4, pp. 247–257, 2006.
[17] T. Horanont and R. Shibasaki, "An implementation of mobile sensing for large-scale urban monitoring," Proceedings of UrbanSense08, International Workshop on Urban, Community, and Social Applications of Networked Sensing Systems, Nov 2008.
[18] R. Ahas, A. Aasa, Ü. Mark, T. Pae, and A. Kull, "Seasonal tourism spaces in Estonia: Case study with mobile positioning data," Tourism Management, vol. 28, no. 3, pp. 898–910, 2007.
[19] N. Eagle, A. Pentland, and D. Lazer, "Inferring social network structure using mobile phone data," Proc. of the National Academy of Sciences, 2006.
[20] P. Bahl and V. Padmanabhan, "RADAR: An in-building RF-based user location and tracking system," in IEEE INFOCOM, vol. 2. IEEE, 2000, pp. 775–784.
[21] E. Elnahrawy, X. Li, and R. Martin, "The limits of localization using signal strength: A comparative study," in Proceedings of the 1st IEEE International Conference on Sensor and Ad Hoc Communications and Networks. IEEE, Oct 2004, pp. 406–414.
[22] A. Ladd, K. Bekris, A. Rudys, L. Kavraki, and D. Wallach, "Robotics-based location sensing using wireless ethernet," Wireless Networks, vol. 11, no. 1, pp. 189–204, 2005.
[23] L. Aalto, N. Göthlin, J. Korhonen, and T. Ojala, "Bluetooth and WAP push based location-aware mobile advertising system," in Proceedings of the 2nd international conference on Mobile systems, applications, and services. ACM, New York, NY, USA, 2004, pp. 49–58.
[24] N. Priyantha, A. Chakraborty, and H. Balakrishnan, "The Cricket location-support system," in Proceedings of the 6th annual international conference on Mobile computing and networking. ACM, New York, NY, USA, 2000, pp. 32–43.
[25] A. Ward, A. Jones, and A. Hopper, "A new location technique for the active office," IEEE Personal Communications, vol. 4, no. 5, pp. 42–47, Oct 1997.
[26] V. Otsason, A. Varshavsky, A. LaMarca, and E. de Lara, "Accurate GSM indoor localization," Lecture Notes in Computer Science, vol. 3660, pp. 141–158, Aug 2005.
[27] R. Wu, Y. Lee, H. Tseng, Y. Jan, and M. Chuang, "Study of characteristics of RSSI signal," in IEEE International Conference on Industrial Technology, 2008 (ICIT 2008). IEEE, 2008, pp. 1–3.
[28] T. Stoyanova, F. Kerasiotis, A. Prayati, and G. Papadopoulos, "Evaluation of impact factors on RSS accuracy for localization and tracking applications," in Proceedings of the 5th ACM international workshop on Mobility management and wireless access. ACM, New York, NY, USA, 2007, pp. 9–16.
[29] B. Fleury and P. Leuthold, "Radiowave propagation in mobile communications: an overview of European research," IEEE Communications Magazine, vol. 34, no. 2, pp. 70–81, Feb 1996.
[30] S. Haykin, "Cognitive radio: brain-empowered wireless communications," IEEE Journal on Selected Areas in Communications, vol. 23, no. 2, pp. 201–220, 2005.
[31] J. Mitola III and G. Maguire Jr, "Cognitive radio: making software radios more personal," IEEE Personal Communications, vol. 6, no. 4, pp. 13–18, 1999.
[32] I. F. Akyildiz, W. Lee, M. Vuran, and S. Mohanty, "Next generation/dynamic spectrum access/cognitive radio wireless networks: A survey," Computer Networks, vol. 50, no. 13, pp. 2127–2159, 2006.
[33] X. Huang, N. Han, G. Zheng, S. Sohn, and J. Kim, "Weighted-collaborative spectrum sensing in cognitive radio," in Communications and Networking in China, 2007, pp. 110–114.
[34] A. Ghasemi and E. Sousa, "Collaborative spectrum sensing for opportunistic access in fading environments," in New Frontiers in Dynamic Spectrum Access Networks (DySPAN 2005), 2005, pp. 131–136.


Evaluating Twitter for Use in Environmental Awareness Campaigns

Peter Mooney*†, Adam C. Winstanley* and Padraig Corcoran*
* Geotechnology Research Group, Department of Computer Science,
National University of Ireland Maynooth (NUIM), Maynooth, Co. Kildare, Ireland
Email: {peter.mooney, adam.winstanley, padraig.corcoran}@nuim.ie
† Environmental Research Centre, Environmental Protection Agency Ireland,
Richview, Clonskeagh,

Abstract—Many studies have shown that the effective harnessing of ICTs is critical in local, national, and global efforts to adapt to and mitigate the effects of climate change. Citizens must be provided with accurate information about environmental issues and should receive this through the most effective communication channels available. In this paper we describe work in progress in evaluating Twitter as a means of distributing environmental information to citizens. This work will attempt to measure how effective the Twitter medium can be in environmental awareness campaigns for issues such as climate change, by carrying out an analysis of a regularly updated database of Twitter messages. This work will also look to establish whether users are discussing environmental issues through their Twitter networks.

I. INTRODUCTION

The online social network Twitter.com (http://twitter.com/) and environmental issues such as climate change and pollution are both inextricably linked with today's popular culture and mass media. In this introductory section we provide a brief overview of both Twitter and the natural environment in order to emphasise their individual positions in modern society, popular culture, and the mass media. We also provide a brief literature review of research carried out on the applications of Twitter and its social impact.

A. Influence of Twitter

Twitter is used by millions of people around the world to stay connected to their friends, family members and coworkers through their computers and mobile phones. The interface allows users to post short messages (up to 140 characters) that can be read by any other Twitter user. Users declare the people they are interested in following, in which case they get notified when that person has posted a new message. A user who is being followed by another user does not necessarily have to reciprocate by following them back, which makes the links of the Twitter social network directed. Zeichick summarises Twitter as the ability to "post and follow text messaging using a browser, special desktop applications, or mobile applications on smartphones" [1]. He comments that with key news media, such as the New York Times and the Telegraph, writing frequently about Twitter, we have an indication that "it (Twitter) has passed the stage where it is only for early adopters". Some newspapers have begun to publish getting-started guides for Twitter, such as "How to make the most of Twitter" from The Guardian [2]. The ubiquitous nature of Twitter is reflected in media reports in the UK from early 2009, where proposed primary school curriculum changes were discussed. These changes would allow schools greater flexibility in what they teach, including plans to teach school children the fundamentals of using Web 2.0 technologies such as Twitter and Wikipedia [3].

B. Environmental Problems

Since the end of 2006 climate change has gradually become the hot topic [4] amongst all other environmental problems.
A large number of events, reports and films, for example the Stern Review, the Fourth Assessment Report of the Intergovernmental Panel on Climate Change [5], the Conference of Parties 13 in Bali and the Live Earth global concerts, have generated increasing media coverage of climate change issues [4]. Conventional environmental awareness campaigns rely strongly on information to change attitudes. To make climate change communication effective, more sophisticated alternatives are suggested, such as harnessing tools and concepts used by brand advertisers, so as to make being climate-friendly desirable rather than a duty or matter of obedience [6]. Climate change affects every citizen at every level: local, regional, national, and global [7]. Some authors [8] have looked at the reporting of climate change in the mass media. The results of this analysis [8] show that scientists tend to be associated with an emphasis on environmental problems and causes, while politicians and special interests tend to be associated with solutions and remedies. A major European Commission survey revealed that "pollution in towns and cities and climate change" are the most frequently discussed environmental topics amongst EU citizens, and this reflects the intense public discussion on these topics [9].


Almost 57% of EU citizens surveyed placed climate change as the number one issue that "worried them about the environment". The report states that "this further reinforces the observation that climate change has become one of the top concerns in the environmental debate".

C. Users and Usage Patterns on Twitter

As we will discuss in Section II below, Twitter.com provides an API to access the Twitter service. This has assisted researchers in carrying out research on various aspects of Twitter, and many novel applications have been developed. Twitter has allowed professionals in the areas of health-care simulation [10] and education to begin "open sharing of relevant and useful knowledge allowing the community to adapt and evolve faster to the rapidly changing health care environment". The social possibilities of mobile technology in transitional spaces such as public transport have been investigated, with researchers designing a location-based friend finder for Twitter that displays only people in the same train as the user in the Stockholm subway [11]. Java et al. present a taxonomy characterising the underlying intentions users have in making Twitter posts, by aggregating the apparent intentions of users extracted from Twitter posting data [12]. This analysis shows that Twitter users with similar intentions connect with each other successfully and find each other amongst the many other millions of users. Other work [13] has gathered Twitter posts from nearly 100,000 users using deep searches of the Twitter network and sampled collections from the publicly available timeline. The authors identified three distinct classes of Twitter users. Firstly, there are users who have a much larger number of followers than they are following themselves; these include media outlets, Hollywood stars, etc. The second group, called acquaintances, are users exhibiting a certain symmetry in their Twitter relationships, that is, they follow people who follow them. The final group is a small group who have the common characteristic that they are following a much larger number of people than they have followers. These are usually people who contact everyone in the hope that they will gain a high following. Other research has investigated whether Twitter operates a form of online "word of mouth branding" [14]. The authors analysed almost 150,000 tweets containing branding comments, sentiments, and opinions. Of the 20% of these tweets found to contain branding comments, almost 50% contained positive sentiments about certain brands. Twitter was seen to have had an influential role in the successful presidential campaign of Barack Obama: "On election day, the Obama campaign used Twitter to post toll-free numbers and texting strings for finding polling locations, connecting to volunteer opportunities, and making contributions" [15].

II. USING THE TWITTER API

Twitter.com provides a REST API (REpresentational State Transfer Application Programming Interface) which allows developers to perform most tasks that users might otherwise perform with their Twitter account using the forms on the Twitter website. With the API, developers can retrieve the last 20 tweets of the accounts the authenticated user is subscribed to, of all unprotected users, or of a specific user. The API provides a means for the programmatic sending and deleting of tweets, direct messaging, friendships, notifications, account blocking, favourite messages, etc.
The REST API is relatively easy to use from any programming language that can perform and handle HTTP GET actions for sending URLs to a server. For most popular programming languages developers can find a Twitter API library allowing the sending and receiving of tweets and the performing of other Twitter-related search and query operations using the syntax and data structures of the specific programming language. There are libraries for the Twitter API in Java, PHP, C++, Ruby, .NET, and PERL. To retrieve information (tweets, searches, user lookups, etc.) from Twitter, one can easily use command line tools such as wget or curl to send a specially formatted URL to the Twitter server and receive information back in plain text format (JSON or XML). To make use of this information it must be parsed. This is where the Twitter API libraries for the programming languages mentioned above become very useful. By querying the Twitter system using the Twitter API within programming code, the returned information can be parsed, analysed, searched, stored in a database, etc. It also provides developers with an opportunity to build Twitter functionality into existing web-based applications.

A. Web-based Applications using the Twitter API

Given the simplicity of the Twitter API, several web-based applications have quickly gained popularity by providing value-added services for Twitter users. Twuffer (http://www.twuffer.com) allows users to schedule tweets for a later date. A tweet is typed into Twuffer with a specified date and time for broadcast; at the specified date and time, Twuffer posts the tweet on Twitter. Twidentify (http://www.twidentify.com) is a search engine for Twitter. There are three ways to search using a keyword. Trend search allows tracking the popularity of a keyword over time. The second is a basic Twitter search of who is using the keyword in their current tweets. The final search option is search on influence. The results of this search are sorted in order of users who are retweeted (directly quoted in other tweets or conversations). This usually gives the opportunity to see what influential people on Twitter are tweeting about your keyword search term. TwitterCounter (http://twittercounter.com/) is a user statistics application allowing users to track their progress on Twitter. The information is presented in time series graphs and allows customisation of the timeframe. Other functions include the ability to compare your statistics to those of other users.

B. Offline analysis of Twitter messages

To supply data for this research it is necessary to build up a large corpus of tweets. We downloaded and stored all tweets which contained relevant keywords: climate, environment, climate change, etc. This corpus of tweets is stored offline due to restrictions placed on the Twitter API in terms of the number of server accesses per day. Figure 1 shows a flowchart of the process of accessing Twitter messages.

Fig. 1. Download of messages from Twitter.com using the Twitter API

A PHP script sends the appropriate HTTP GET (keyword search, user search, trend request, public timeline) request to the Twitter server. The Twitter server replies with XML formatted output. Each tweet object contains the message text, information about the author, and the timestamp of the tweet. This XML is processed by the PHP script, which parses each tweet message (including reformatting of special characters and extraction of usernames within messages) and inserts the tweet into a database. The Entity Relationship diagram for the message database is shown in Figure 2.

Fig. 2. Entity Relationship Diagram for Message Database

The Twitter API restricts calling applications to 70 requests per hour unless otherwise arranged with Twitter.com. For this reason we cache the results of public timeline requests and keyword searches. Each tweet has a unique alpha-numeric identifier; this helps to avoid duplication of tweets within our offline database. Textual analysis of this dynamically updated database of Twitter posts is then performed.
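As an illustration of this polling-and-caching loop, the sketch below fetches keyword search results over HTTP and stores tweets keyed by their unique identifier. It is written in Python against the public JSON search endpoint that existed when this paper was written; that endpoint has since been retired and authentication is now required, so the URL and response field names should be read as historical assumptions rather than a working recipe (the authors' own implementation used PHP and the XML response format).

```python
import json
import time
import urllib.parse
import urllib.request

# Sketch of the polling-and-caching loop described above: issue an
# HTTP GET for a keyword search, parse the response, and store tweets
# keyed by their unique identifier so duplicates are skipped. The
# endpoint and response fields are those of the public JSON search API
# available when this paper was written; that API has since been
# retired, so treat them as historical assumptions.

SEARCH_URL = "http://search.twitter.com/search.json"
corpus = {}  # tweet id -> record; stands in for the message database

def fetch_tweets(keyword):
    """Send a keyword search request and return the parsed result list."""
    url = SEARCH_URL + "?" + urllib.parse.urlencode({"q": keyword})
    with urllib.request.urlopen(url) as resp:
        return json.load(resp).get("results", [])

def store(tweets):
    """Insert tweets into the corpus; duplicate identifiers are skipped."""
    for t in tweets:
        corpus.setdefault(t["id"], {
            "text": t.get("text"),
            "user": t.get("from_user"),
            "created_at": t.get("created_at"),
        })

# Poll well inside the 70 requests per hour limit mentioned above.
for kw in ["climate change", "environment"]:
    store(fetch_tweets(kw))
    time.sleep(60)
```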


We are currently investigating a number of issues, which are summarised as follows:

• Investigation of temporal correlations between the use of climate change related vocabulary and periods of major media coverage of climate change issues and events.
• Specific analysis of Twitter messages from Ireland containing climate change related vocabulary. The Twitter API provides geocoding based on a user's location from their Twitter profile: a circle of N kilometers is searched, centered on a (lat, long) pair.
• Analysis of the number of retweets for climate change related issues. A retweet is when one individual copies a tweet from someone in their network and shares it with their own network. It is acknowledged as the highest degree of content approval on Twitter.

III. CONCLUSIONS AND FUTURE DIRECTION

In this short paper we have described research we are currently undertaking to establish how effective Twitter could be as a tool in environmental awareness campaigns. We looked at the specific issue of climate change to establish whether people using Twitter were communicating about environmental issues in their Twitter networks.

A. Future Direction of Twitter

Honey et al. [16] predict that tools such as Twitter "will soon come to be used in formal collaborative contexts, as well for example in work involving distributed teams just as instant messaging was used before". Other researchers [17] comment that as the new generation of scientists "grow up" with instant messaging, blogging and Twitter, they are beginning to explore ways to use these technologies for information exchange and collaboration. The future direction of Twitter is somewhat unknown, with Lucky [17] stating that "we are in the middle of something happening around us and nobody really understands the consequences". Twitter has come "from nowhere to become the third most visited social networking site in the US in just three years by allowing its users to broadcast their thoughts, actions and news instantly" [18]. This rise has caused Google to "admit to losing out to Twitter in the race to meet web user demand for real-time information". Interestingly, young adults and teenagers have not yet taken Twitter seriously, according to Internet surveys such as the Nielsen Net Survey [19]. Some research indicates that as long as teenagers can update their online status via MySpace and Facebook for their friends, as well as using Instant Messaging and SMS texts, Twitter doesn't really add to the existing technology.
Many young adults are only seeing the media and business aspects of Twitter.

B. Using Twitter Securely

We believe that as Twitter becomes more widely adopted by citizens, the issues of information and personal security when using Twitter will need to be addressed. This was also the case when email became ubiquitous [20], [21]. Some literature has begun to appear regarding the security of Twitter. Some research shows that many users, often willingly, "share personal identifying information about themselves but do not have a clear idea of who accesses their private information or what portion of it really needs to be accessed" [22]. For those organisations who have "a business need to use Twitter then there must be training provided on how to use Twitter in a secure manner" [23].


The authors emphasise the need to "provide ongoing awareness communications about Twitter information security and privacy issues". Vulnerabilities in Twitter's Javascript programming code leave the "microblogging service with major holes in its security" [24]. This is expanded upon by Bradbury, who gives examples of how worms and other malicious code could be transported around the Internet by the exchange of links within Twitter messages [25]. For Twitter to gain acceptance as a communication device for serious issues such as climate change, the authors feel it is necessary that the problems of the email world (spam, junk mailing, phishing, etc.) are tackled aggressively and effectively. Otherwise users will follow the same usage patterns as they use when managing their email: only trusting a small set of users, or friends, and deleting any material which looks dubious.

C. Public Awareness of Climate Change

Public awareness is key to making a real difference in fighting environmental problems such as climate change [4]. However, due to ineffective communication strategies, much effort to educate the public on climate change issues has not translated into a great degree of concrete progress. As outlined in [26], the experiences of the UK, Canada and Sweden demonstrate that climate change communication campaigns appear to influence large numbers of people in relatively short periods of time. These campaigns were based on pro-social behavioural campaigns but had little success in changing people's habits and behaviours. Harnessing the pro-social aspects of Twitter could prove a useful tool in informing the public better about environmental problems. "This is low-hanging fruit in the fight against climate change that our society really can't afford not to harvest" [26]. We believe that Twitter can assist in communicating information about climate change. Tools such as Twitter can address "the dichotomy of high awareness and low priority strongly related to ineffectiveness of some environmental communications" [4].

ACKNOWLEDGMENT

Dr. Peter Mooney leads a 5 year project (2008-2013) called "Geoinformatics Services for Improved Access to Environmental Data and Information" funded by the Environmental Protection Agency STRIVE research programme. Dr. Adam Winstanley is funded under StratAG (Strategic Research in Advanced Geotechnologies) and is the Principal Investigator for the Location Based Services (LBS) strand. Dr. Padraig Corcoran is a Lecturer at the Department of Computer Science.

REFERENCES

[1] A. Zeichick, "A-twitter over twitter," netWorker, vol. 13, no. 1, pp. 5–7, 2009.
[2] C. Arthur, "How to make the most of twitter," The Guardian: Thursday 8 May 2008, Technology news and features section (page 1), 2009.
[3] P. Curtis, "Pupils to study twitter and blogs in primary shake-up," The Guardian: Wednesday 25 March 2009, Top Stories (page 1), 2009.
[4] C. K. Tan, A. Ogawa, and T. Matsumura, "Innovative climate change communication: Team minus 6%," Global Environment Information Centre (GEIC), United Nations University (UNU), 53-70 Jingumae 5-chome, Shibuya-ku, Tokyo 150-8925, Japan, GEIC Working Paper Series 2008-001, 2008.
[5] IPCC, "Climate change 2007: Synthesis report. The fourth assessment report of the Intergovernmental Panel on Climate Change," Intergovernmental Panel on Climate Change, IPCC, Geneva, Switzerland, IPCC Synthesis Report, ISBN 978 0521 88009-1, 2006.
Segnit, “Warm words: How are we telling the climatestory and can we tell it better?” Institute for Public Policy Research,IPPR,30 - 32 Southampton Street,Covent Garden, London, WC2E 7RA,England., IPPR Technical Report 435, 2006.[7] J. Laukkonen, P. K. Blanco, J. Lenhart, M. Keiner, B. Cavric, andC. Kinuthia-Njenga, “Combining climate change adaptation and mitigationmeasures at the local level.” Habitat International, vol. 33, no. 3,pp. 287 – 292, <strong>2009</strong>.[8] C. Trumbo, “Constructing climate change: claims and frames in US newscoverage of an environmental issue,” Public Understanding of Science,vol. 5, no. 3, pp. 269–283, 1996.[9] European Commision, “Attitudes of european citizens towards theenvironment,” Directorate General Environment and coordinated byDirectorate General Communication., Download at http://ec.europa.eu/public opinion/archives/eb special en.htm, Special Eurobarometer SurveyReport Ref: 295 Wave EB68.2, 2008.[10] D. Weberg, “Twitter and simulation: Tweet your way to better sim,”Clinical Simulation in Nursing, vol. 5, no. 2, pp. e63 – e65, <strong>2009</strong>.[11] N. Belloni, L. E. Holmquist, and J. Tholander, “See you on the subway:exploring mobile social software,” in CHI EA ’09: <strong>Proceedings</strong> of the27th international conference on Human factors in computing systems.New York, NY, USA: ACM, <strong>2009</strong>, pp. 4543–4548.[12] A. Java, X. Song, T. Finin, and B. Tseng, “Why we twitter: An analysisof a microblogging community.” in Advances in Web Mining and WebUsage Analysis, ser. Lecture Notes in Computer Science, H. Zhang andM. Spiliopoulou, Eds., vol. 5439. Springer, 2007, pp. 118–138.[13] B. Krishnamurthy, P. Gill, and M. Arlitt, “A few chirps about twitter,” inWOSP ’08: <strong>Proceedings</strong> of the first workshop on Online social networks.New York, NY, USA: ACM, 2008, pp. 19–24.[14] B. J. Jansen, M. Zhang, K. Sobel, and A. Chowdury, “Micro-bloggingas online word of mouth branding,” in CHI EA ’09: <strong>Proceedings</strong> of the27th international conference on Human factors in computing systems.New York, NY, USA: ACM, <strong>2009</strong>, pp. 3859–3864.[15] S. Greengard, “The first internet president,” Commun. ACM, vol. 52,no. 2, pp. 16–18, <strong>2009</strong>.[16] C. Honey and S. Herring, “Beyond microblogging: Conversation andcollaboration via twitter,” in System Sciences, <strong>2009</strong>. HICSS ’09. 42ndHawaii International Conference on, Jan. <strong>2009</strong>, pp. 1–10.[17] C. R. Aragon, S. Poon, and C. T. Silva, “The changing face of digitalscience: new practices in scientific collaborations,” in CHI EA ’09:<strong>Proceedings</strong> of the 27th international conference on Human factors incomputing systems. New York, NY, USA: ACM, <strong>2009</strong>, pp. 4819–4822.[18] R. Wray, “Google falling behind twitter,” The Guardian Online: Tuesday19 May <strong>2009</strong>. http://www.guardian.co.uk/business/<strong>2009</strong>/may/19/google-twitter-partnership, <strong>2009</strong>.[19] M. McGiboney, “Twitters tweet smell of success,” Nielsen NetView Survey.<strong>2009</strong> (Feburary) Available at http://blog.nielsen.com/nielsenwire/online mobile/twitters-tweet-smell-of-success/, <strong>2009</strong>.[20] A. Levi and c. K. Koç, “Inside risks: Risks in email security,” Commun.ACM, vol. 44, no. 8, p. 112, 2001.[21] P. Kumaraguru, Y. Rhee, A. Acquisti, L. F. Cranor, J. Hong, andE. 
Nunge, “Protecting people from phishing: the design and evaluationof an embedded training email system,” in CHI ’07: <strong>Proceedings</strong> ofthe SIGCHI conference on Human factors in computing systems. NewYork, NY, USA: ACM, 2007, pp. 905–914.[22] B. Krishnamurthy and C. E. Wills, “Characterizing privacy in onlinesocial networks,” in WOSP ’08: <strong>Proceedings</strong> of the first workshop onOnline social networks. New York, NY, USA: ACM, 2008, pp. 37–42.[23] R. Power and D. Forte, “War and peace in cyberspace: Don’t twitteraway your organisation’s secrets,” Computer Fraud and Security, vol.2008, no. 8, pp. 18 – 20, 2008.[24] J. Sullivan, “Why tweeters should beware of worms,” The New Scientist,vol. 201, no. 2706, pp. 18 – 18, <strong>2009</strong>.[25] D. Bradbury, “Twitter hit by worm attacks,” Computer Fraud andSecurity, vol. <strong>2009</strong>, no. 4, pp. 1 – 1, <strong>2009</strong>.[26] K. Akerlof and E. W. Maibach, “Sermons as a climate change policytool: Do they work? evidence from the international community,” GlobalStudies Review, vol. 4, no. 3, pp. 4–7, 2008.86


Research on Unmanned Airship Low-altitude Photogrammetric Flight Planning and Pseudo-ortho Problems

DUAN Yi-ni (1. Geomatics College, Shandong University of Science and Technology, Qingdao, China; 2. Peking University, Beijing, China; email: yiniduan@126.com)
ZHENG Wen-hua (Geomatics College, Shandong University of Science and Technology, Qingdao, China; email: zhenwenhua_sd@163.com)
YAN Lei (Institute of Remote Sensing & Geographic Information System, Peking University, Beijing, China; email: lyan@pku.edu.cn)

Abstract

The unmanned airship is a new platform for aerial photogrammetry. This paper focuses on airship low-altitude photogrammetric flight planning and parameter design. It gives a series of formulas to calculate flight altitude, flight route and other photogrammetric parameters. Moreover, the paper discusses a solution to the pseudo-ortho problem for high buildings: according to the relationship among flight altitude, building height, offset from the flight route centre and deformation, function models are used to correct the deformation.

Keywords: unmanned airship; low-altitude photogrammetry; flight altitude optimal design; flight route design; pseudo-ortho

1. Introduction

Compared with other photogrammetric platforms, the unmanned airship is less stable and is easily influenced by airflow. As a result, some images' inclination angles are too large, with varying grey levels and irregular overlap. To overcome these disadvantages of an unmanned airship low-altitude photogrammetry system, it is necessary to design optimal photogrammetric flight parameters based on the characteristics of the airship itself. Because high buildings produce large orthophoto projection errors under central projection, this paper also investigates the bottleneck of orthophoto production, namely the solution to pseudo-ortho problems.

2. Photogrammetric flight planning and parameter design

2.1 Optimal flight altitude design

In the ideal state (the ground is flat and the photo is level), the relationship between flight altitude and scale is shown in (1):

1/m = f/H    (1)

where m is the denominator of the photogrammetric scale, f is the focal length of the camera and H is the flight altitude.

A. The first method of flight altitude optimal design

The map scale determines the photogrammetric scale, from which the flight altitude is calculated by (1). The relationship between map scale and photogrammetric scale [1] is shown in Tab.1, together with the corresponding flight altitudes for the three map scales.

Tab.1 The relationship of map scale, photogrammetric scale and flight altitude

Map scale   Simulative photogrammetric scale   Digital photogrammetric scale   Flight altitude (m)
1:500       1:2000~1:3000                      1:4000                          96
1:1000      1:4000~1:6000                      1:6000~1:8000                   144~192
1:2000      1:8000~1:12000                     1:12000                         288

B. The second method of flight altitude optimal design

The flight altitude can also be derived from the ground resolution (GR) of the digital camera. The relationship between GR and H [2] is shown in (2):

GR = (H × N × d) / (f × rows)   or   GR = (H × N × d) / (f × cols)    (2)

where f is the focal length of the camera, H is the flight altitude, N is the number of pixels in rows or columns, and d is the single pixel size.

With (2), the optimal flight altitude for different scales can be designed to meet precision requirements.
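To make relations (1) and (2) concrete, the short sketch below evaluates them for the Canon 5D parameters used later in this paper (f = 24 mm, 4368 pixels per row, 35.8 mm sensor width). The function names are our own illustration rather than part of any flight-planning toolchain.

    # A minimal sketch of relations (1) and (2); parameter values are
    # taken from the Canon 5D camera described in Tab.2.
    F_MM = 24.0                     # focal length f (mm)
    COLS = 4368                     # pixels per image row
    SENSOR_W_MM = 35.8              # CMOS width, i.e. N*d (mm)
    PIXEL_MM = SENSOR_W_MM / COLS   # single pixel size d (mm)

    def scale_denominator(h_m):
        """Eq. (1): 1/m = f/H, hence m = H/f (H converted to mm)."""
        return h_m * 1000.0 / F_MM

    def ground_resolution(h_m):
        """Eq. (2) reduced along one image axis: GR = H*d/f (m per pixel)."""
        return h_m * PIXEL_MM / F_MM

    def altitude_for_gr(gr_m):
        """Eq. (2) inverted: H = GR*f/d."""
        return gr_m * F_MM / PIXEL_MM

    print(scale_denominator(96))           # ~4000, the 1:4000 row of Tab.1
    print(round(altitude_for_gr(0.102)))   # ~299 m, as designed in section 4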


2.2 Flight route planning

Flight route planning is based on the formula for the photogrammetric baseline. With the overlap degree Q, the image width l (in pixels) and the ground resolution GR, the baseline b [3] is:

b = (1 − Q) × l × GR    (3)

A. Flight route interval optimal design

Tab.2 shows that, for the Canon 5D digital camera, l_y (the lateral width of the image) = 2912 pixels. If the ground resolution is 0.15 m and the lateral overlap degree is 30%~50%, the lateral baseline length can be calculated with (3):

Tab.2 Parameters of the Canon 5D digital camera

Camera type   Image array (pixel)   CMOS (mm)    Lens focal length (mm)
Canon 5D      4368×2912             35.8×23.9    24

b_0.3 = (1 − 0.30) × l_y × GR = 305.76 m
or
b_0.5 = (1 − 0.50) × l_y × GR = 218.4 m

From these results, a suitable flight route interval is about 200 metres, which ensures a lateral overlap degree of 30%~50%.

B. Location of photographic station optimal design

With the flight direction as X, the location of photographic station n is [4]:

X_n = X_1 + (n − 1) × b_x    (4)

where b_x is calculated with (3) and X_1 is the location of the first photographic station.

C. Optimal design of the image number in a flight route

n = int(L / b) + 2    (5)

where L is the length of a whole flight route and b is the length of the baseline in the flight direction.

3. Research on pseudo-ortho problems of high buildings

Since airborne photogrammetry is a central projection, the higher the buildings, or the further the airship is offset from the flight route centre, the greater the projection error. Fig.1 shows the relationship among flight altitude H, building height h, radius vector r and projection error δ_h.

Fig.1 Projection error δ_h

With (6) we can calculate δ_h:

δ_h = (h / H) × r    (6)

where r is the radius vector from image point a to the plate nadir point n, and δ_h is the projection error aa_0.

From (6), we obtain two solutions to the pseudo-ortho problem of building height:

A. δ_h = (h / H) × r. If the flight altitude H, building height h and radius vector r are given, the projection error δ_h can be calculated. Given δ_h, the projection error, i.e. the pseudo-ortho deformation, can be rectified in the ERDAS software.

B. r = (δ_h × H) / h. If the tolerance value of the projection error Max δ_h, the flight altitude H and the building height h are given, the radius vector r can be calculated. Drawing a circle of radius r, the deformation inside the circle is less than the tolerable value, so the orthophoto quality for that part will be higher.
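The route-planning and pseudo-ortho relations (3)-(6) can likewise be collected into a few helper functions. This is a minimal sketch under the paper's notation, with our own illustrative function names; the worked values reproduce the 30% and 50% lateral-overlap baselines computed above.

    # A sketch of Eqs. (3)-(6); function names are illustrative only.
    def baseline(q, width_px, gr_m):
        """Eq. (3): b = (1 - Q) * l * GR."""
        return (1.0 - q) * width_px * gr_m

    def station_x(x1_m, n, bx_m):
        """Eq. (4): X_n = X_1 + (n - 1) * b_x."""
        return x1_m + (n - 1) * bx_m

    def images_per_route(route_len_m, b_m):
        """Eq. (5): n = int(L / b) + 2."""
        return int(route_len_m / b_m) + 2

    def projection_error(h_building_m, flight_alt_m, r_m):
        """Eq. (6): delta_h = (h / H) * r."""
        return h_building_m / flight_alt_m * r_m

    def tolerable_radius(max_delta_m, flight_alt_m, h_building_m):
        """Solution B, Eq. (6) inverted: r = Max(delta_h) * H / h."""
        return max_delta_m * flight_alt_m / h_building_m

    print(baseline(0.30, 2912, 0.15))   # 305.76 m for 30% lateral overlap
    print(baseline(0.50, 2912, 0.15))   # 218.40 m for 50% lateral overlap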


4. Photogrammetric flight test

In April 2008, the authors applied the unmanned airship low-altitude photogrammetry system to the task of producing 1:2000 photogrammetric topographic maps and corresponding orthophotos of Shengjian Coal Mine in Sanhekou City, Shandong Province, China. Total flight area: 16.8 km2; ground resolution: 10.2 cm; forward overlap: 75%; lateral overlap: 55%.

With the Canon 5D digital camera parameters in Tab.2 and the flight planning formulas, the flight parameters were designed as follows.

A. Flight altitude optimal design

W = 0.102 × 4368 = 445.5 m

Then the flight altitude is:

H = (f × W) / (N × d) = (24 × 445.5) / 35.8 ≈ 299 m

B. Flight route interval optimal design

Length of the lateral baseline:

b_0.55 = (1 − 0.55) × l_y × GR = 0.45 × 2912 × 0.102 = 133.66 m

From these results, the flight altitude was designed as 300 m, the flight route interval as 133 m, and the ground coverage area of every photo as 400 m × 300 m. Fig.2 shows the aerial photos of the "South 2" flight region of the coal mine. The precision of certain orientation points after aerial triangulation is shown in Tab.3.

Fig.2 Flight routes and aerial photos of the "South 2" flight region

The planar point mean square error is ±0.155 m and the elevation mean square error is ±0.103 m.

Tab.3 The coordinate differences between stereo pair and RTK (Real-time Kinematic) measurements

Test point   RTK coordinates (X, Y, Z)               Coordinates in stereo pair (X, Y, Z)    Differences (X, Y, Z)
20001        508485.452  3856682.175  35.879         508485.369  3856682.134  35.882          0.083   0.041  -0.003
20002        508328.752  3854900.606  34.262         508328.536  3854900.571  34.366          0.216   0.035  -0.104
20005        508338.393  3855954.610  34.423         508338.346  3855954.430  34.560          0.047   0.180  -0.137
20006        508387.974  3856335.317  36.687         508388.124  3856335.204  36.886         -0.150   0.113  -0.199
20012        508097.988  3856711.690  35.254         508098.095  3856711.730  35.378         -0.107  -0.040  -0.124
Standard deviation of overall sample                                                           0.133   0.075   0.064

The results show that the aerial triangulation precision satisfies the requirements of the "1:500, 1:1000, 1:2000 topographic maps survey norms".

5. Conclusion

1) Before the photogrammetric flight, the flight parameters should be designed with the formulas in this paper, providing a rigorous theoretical basis for flight planning. In actual operations, however, ground surface irregularity, flight off-course, image tilt and other factors must be taken into consideration, so the flight parameters should be adjusted to the flight conditions.

2) To verify that the designed flight parameters are valid, a flight test can be carried out over a small part of the whole survey region. By analysing the differences between the coordinates measured in the stereo pair and the corresponding RTK coordinates, the precision of the aerial triangulation can be obtained, which reflects the quality of the airborne photogrammetry.

3) This paper introduces a function-model method to solve the pseudo-ortho problem. It must, however, be realised by extending the ERDAS software with a new pseudo-ortho rectification function, which would greatly improve the efficiency and quality of orthophoto production.

References
[1] CUI Hong-xia, LIN Zong-jian, LI Guo-zhong, and SUN Ying. UAVs for Generation of Digital Large-Scale Orthophotos. Chinese Journal of Electron Devices, 2008, Vol.31, No.1 (in Chinese).
[2] LIN Zong-jian. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol. XXXVII, Part B1, Beijing, 2008.
[3] YAN Lei, DING Jie, ZHAO Shi-hu, LIAN Zhou-hui, GAO Peng-qi, and LIU Yue-feng. Key Technologies and Implementation of a Ground-based Simulation System for Digital Aerial Remote Sensing. Image Technology, 2006, 1 (in Chinese).
[4] Henri Eisenbeiss. A Mini Unmanned Aerial Vehicle (UAV): System Overview and Image Acquisition. International Workshop on "Processing and Visualization Using High-resolution Imagery", 2004.


Spatial-temporal Simulation and Prediction of Sandy Desertification Evolution in a Typical Area of Xinjiang

Dunli Liu 1, Jianghua Zheng 1, Zhihui Liu 1,2,*, Fei Wang 1
1. College of Resources and Environment Science, Xinjiang University, Oasis Ecology Key Lab of National Education Bureau, Urumqi 830046, Xinjiang, China; e-mail: ldl_rain@126.com, itslbs@gmail.com
2. International Center for Desert Affairs - Study on Sustainable Development in Arid and Semi-arid Lands, Urumqi 830046, China; e-mail: lzh@xju.edu.cn, moncak2008518@yahoo.com.cn
* Corresponding author. Tel: +86 991 8582378, Fax: +86 991 8585504, E-mail: lzh@xju.edu.cn

Abstract — A quarter of the territory of Xinjiang is covered with desertification landscape, and monitoring results indicate that it is still expanding at a rate of 84.5 km2 per year. The severity of desertification has become a threat to the sustainable development of oasis ecology and to human living space. A typical research area was located in the northern part of the Hotan Oasis. The following methods were used. Firstly, remote sensing data from three different time phases were used to classify and analyse the sandy desertification of the area. Secondly, the main factors of sandy desertification were selected and quantified to establish the Sandy Desertification Index (SDI) model. Thirdly, sample points were chosen and data were collected based on sandy desertification types, taking the year 2000 as the benchmark. The points were used to calculate the SDI and match sandy desertification types, whose evolution was simulated through the change of SDI back to 1990, so that the sandy desertification distribution map could be reconstructed by an interpolation algorithm and matched with the remote sensing image. Finally, a linear regression model was used to predict the main factors at the sample points, and the forecast of potential spatial evolution was realised through the SDI, with 2010 as the target year. Four results were reached. 1) Sandy desertification expanded quite obviously in the area over the past nearly three decades, the increase accounting for 5.98% of the total area. Serious sandy desertification increased from 19.39 km2 to 41.86 km2, showing the most prominent evolution; the high and light classes increased mildly while the modest class declined. 2) The SDI model was built from seven main factors. The real sandy desertification types of forty sample points were matched with their SDI based on observational data from 2000, with a correct rate of 82.5%. 3) The sandy desertification map of 1990 built through the SDI agreed closely with the remote sensing image, with a relative error of only -2.72 percent. 4) The prediction indicates that the sandy desertification situation in the study area will be aggravated by 2010: 11.2 km2 of farmland will be annexed by desert, and the areas of serious and light sandy desertification will increase by 19.84 km2 and 8.88 km2 respectively. The conclusion is that the SDI model can reasonably reflect sandy desertification types, and that the spatial-temporal simulation and prediction of sandy desertification evolution can be realised with the model, providing more effective and intuitive expression and decision support for sandy desertification research and prevention.

Keywords: sandy desertification; spatial-temporal evolution; SDI; simulation and prediction; Xinjiang

I. INTRODUCTION

As one of the most important environmental problems, sandy desertification impacts and haunts the survival of all humankind and the sustainable development of society. Not only does it threaten the entire human environment, it has also become a barrier to global economic development and social stability. Current research covers such aspects as types, distribution, monitoring, evaluation, prevention, trend forecasting and movement mechanisms worldwide [1-8]. At present, however, as a still-developing field, there is no uniform definition of the concept of sandy desertification. This research integrates current views and defines sandy desertification as follows: under the condition of a sandy surface in arid, semi-arid and partly semi-humid areas, the fragile balance of the natural ecosystem is undermined by natural factors or human activities, and land degradation gradually develops, indicated by wind activity and landscapes of wind erosion, so that desert-like landscapes come into being in regions where they never appeared before [9-12].

Xinjiang is located in the hinterland of Eurasia and is the province with the widest desert distribution and the severest desertification harm in China; it is also one of the regions suffering most seriously from sandy desertification in the world. Its unique terrain of two basins enclosed by three mountain ranges produces a dry and windy climate, which provides congenital conditions for the formation and expansion of sandy desertification. According to the census data of 2000 by the Forestry Survey and Design Institute of Xinjiang, the area of sandy desertification in Xinjiang has reached 4.3×10^5 km2, 25.8% of the total, and it is currently increasing at an annual rate of 84.5 km2. Research has been carried out since the 1970s. The studies mainly focus on the Tarim River valley and the southern margin of the Tarim Basin and relate to desertification types, trends, causes or drivers, as well as


control measures and so on [13, 14]. However, it is very difficult to achieve the expected results with purely theoretical expression and numerical analysis, particularly for large-scale desertification and its slow changing process. Managers and policy-makers need a better visual and simulated process to directly support the prevention and control of desertification. In this research, the SDI (Sandy Desertification Index) model was built from the main effective factors of desertification, while the classification and analysis of desertification was carried out using remote sensing data. Index values and desertification types were matched on the basis of reasonably selected sample points, and the spatial-temporal evolution of the latter was expressed through the model. At the same time, the spatial distribution for a particular year was constructed by an interpolation algorithm, achieving the expected effect of simulating and predicting the spatial-temporal evolution of desertification. In this respect, most scholars have only used the impact factors of desertification to evaluate its development or genetic type; research that uses these impact factors to build an SDI model, and achieves the simulation and prediction of spatial-temporal evolution through changes in the SDI, is quite rare.

An area in the northern region of the Hotan Oasis with classic sandy desertification evolution was selected for research. Three remote sensing images, MSS (Jul. 1973), TM (Oct. 1990) and ETM+ (Oct. 2000), were calibrated and used for sandy desertification classification, and the desertification evolution over the past three decades was analysed. The SDI model was constructed from the main effective factors based on the sample points' data of 2000, and the desertification types of those points were analysed according to the correlation between desertification types and their SDI. The types' statuses were then simulated for 1990. Furthermore, the spatial-temporal distribution of sandy desertification in the study area was simulated by an interpolation algorithm with the support of GIS software. The result shows a good match, with a relative error of -2.72 percent compared with the remote sensing data. On this basis, the data of the sample points for 2010 were predicted with the relevant models, realising the spatial-temporal prediction of desertification. The results indicate that the situation in the study area is becoming more and more serious: a large area of farmland will turn to desert, and light and serious desertification will expand substantially, so control work is imminent.

II. ANALYSIS OF SANDY DESERTIFICATION CHANGE IN THE STUDY AREA

A. Overview of the study area

The oasis located on the southern margin of the Tarim Basin is one of the areas most affected by sandy desertification in Xinjiang, and the Hotan Oasis faces the severest threat because of the Taklimakan Desert's expansion, which makes it one of the poorest regions in Xinjiang and indeed in the whole country [15]. In view of these reasons, an area in the north of Hotan was chosen for research, located at 37°05'~37°11'N, 79°59'~80°06'E. The area lies at the junction of desert and oasis with all types of desertification present, which makes it well representative. It comprises 400 (rows) × 400 (columns) elements and its real area is 144 km2, with an altitude of 1353 m, a mean temperature of 14.8℃, 23.8 mm of rainfall and a wind speed of 2.1 m/s (data of 2007).

B. Remote sensing image processing and analysis

MSS (Jul. 1973), TM (Oct. 1990) and ETM+ (Oct. 2000) data were geometrically and radiometrically corrected with ENVI, combined with topographic maps and other data. The coordinates of the study area were used as the basis for clipping the images to ensure their consistency. The area was classified into different types on the basis of image characters and interpretation keys, as well as the indicator system of land classification; the standard of image interpretation is shown in Table I. The sandy desertification lands were divided into light, modest, high and serious classes, based on the classification standard provided by the Food and Agriculture Organization (FAO) and the United Nations Environment Programme (UNEP) [16-18].

There are complex land types in the study area, both desertification and non-desertification ones such as farmland, woodland and settlements. To highlight the focus of this research, the desertification lands were classified in detail and the non-desertification ones were combined into farmland. Classification results are shown in Figure 1.

C. Change analysis of sandy desertification in the study area

The spatial-temporal evolution of the research area over the past 30 years was identified and analysed based on the discrimination standard of sandy desertification types, so the statistics of sandy desertification area for different types and periods could be obtained easily with the built-in statistical functions of ENVI (Table II).

The Shift Degree (SD) is introduced to show the change of the different sandy desertification types' areas more intuitively; it refers to the change of sandy desertification per time unit. One year is taken as the time unit in this study because sandy desertification is a slow process. That is, SD is the annual growth rate, expressed by the following formula:

SD = ((A_j − A_i) / A_i) × (1 / T) × 100%    (1)

In formula (1), A_i is the initial area of desertification land, A_j the final area, and T the time interval [19-21].
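The SD computation is a one-liner; the sketch below is our own illustration and reproduces the serious-class value of Table III from the areas of Table II:

    def shift_degree(a_i_km2, a_j_km2, t_years):
        """Eq. (1): SD = ((A_j - A_i) / A_i) * (1 / T) * 100, in % per year."""
        return (a_j_km2 - a_i_km2) / a_i_km2 / t_years * 100.0

    # Serious desertification, 1973-1990 (areas from Table II, T = 17 years):
    print(round(shift_degree(19.39, 39.58, 17), 2))   # 6.13, matching Table III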


The total SD of the study area is 2.64% from 1973 to 1990, as can be seen from Table III. Serious desertification has the largest SD, reaching 6.13%. The next largest change is modest desertification, which declines by 2.68%, while the light and high classes change only a little. It can be seen that the light, modest and high desertification classes show different degrees of reduction, and the reduced area has turned into the serious class. Between 1990 and 2000, the overall SD is 2.69%, a slightly larger increase than in the 20 years before. The SD of light desertification becomes larger, at 2.59%, followed by the modest, high and serious classes in turn. Overall, the desert land increased from 91.59 km2 to 100.10 km2 over the past nearly three decades, the increase accounting for 5.91% of the total area. Serious desertification shows the most obvious increase and the modest class is reduced quite apparently. Modest desertification has been converted into the serious class over large areas as a result of people's negligence of desertification, as well as continuous overgrazing and land reclamation.

III. SDI MODEL CONSTRUCTION

A. Selection of effective factors of sandy desertification

Sandy desertification in Xinjiang has various causes, which can be divided into two groups, natural and cultural, whose interaction exacerbates the desertification process. A land surface of loose sand, an arid climate and sparse vegetation lead to a weak ecosystem, which provides the congenital material basis and driving force for the occurrence and development of sandy desertification. Therefore, wind speed, temperature, precipitation and surface vegetation are essential elements in the process of sandy desertification. The annual wind speed, annual temperature, annual precipitation and vegetation coverage are used to express these factors, taking one year as the time unit.

At the same time, many studies show that the formation of sandy desertification in Xinjiang is also substantially connected with the intensity and scale of socioeconomic human activities. Yudong Song [15] states that "sandy desertification caused by unreasonable reclamation and utilization of water resources takes up 85% of all". The various excessive economic activities that result from population and livestock pressure are the main triggering factors of sandy desertification; population density and grazing capacity are used to represent them in this research. On the other hand, the degree of sandy desertification is also related to the social economy, increasing exponentially with the level of economic development; this is expressed by the index of Net Revenue Per Capita (NRPC).

B. Quantification and classification of effective factors

The selected factors are annual wind speed, annual temperature, annual precipitation, vegetation coverage, population density, grazing capacity and NRPC. Their classification standard is based on that provided by FAO and UNEP in 1984, combined with domestic research results and the observed data of the study area over the years. All factors are divided into light, modest, high and serious types according to their values, with reference to the classification of sandy desertification (Table IV) [22-24].

C. SDI model construction

The SDI is the quantisation of the sandy desertification type of a study area or point. The model is constructed from the intensity index values of the effective desertification factors and their weight coefficients. In formula (2), η is an adjustment coefficient, W_i is the weight coefficient of factor i, F_i is its intensity index, and N indicates the number of factors, which is seven in this study:

SDI = η × Σ_{i=1..N} W_i × F_i    (2)

The reason for using the adjustment coefficient is as follows. There are land types other than sandy desertification, such as farmland and woodland, which have high vegetation coverage as non-desertification land. To represent land types more comprehensively without a detailed analysis of non-desertification land, vegetation coverage is treated as the basis of discrimination.
So the value of η is 0 when the vegetation coverage is more than 90%, and 1 otherwise.

The value range of the intensity index F_i is [0, 1] for computational convenience. Each factor's intensity index is identified by the classification standard shown in Table IV, and its specific value can be obtained through interpolation.

Commonly used methods for weight coefficients are the Analytic Hierarchy Process, Delphi, and so on. These are influenced by subjective factors to a greater or lesser degree, which makes the results less scientific. Principal component analysis, based on statistical principles, is therefore introduced in this study: the weight coefficient is given by the proportion of each factor's eigenvalue in the total of the correlation matrix. The standardised method (z-score) is applied to the correlation matrix and eigenvalue calculation based on the factors' data from 1973 to 2000, computed with SPSS (Table V) [25].

IV. SPATIAL-TEMPORAL SIMULATION OF SANDY DESERTIFICATION EVOLUTION

A. Sample point selection and data acquisition

Sample points should be selected from the different land types as fully as possible, i.e. light, modest, high, serious and non-desertification areas, taking into account residents' and livestock activities, vegetation types and distribution, and so on, according to the distribution of desertification types and the observation data of 2000. There are forty points in the study area, eight in each land type. Each sample point covers an area of 30×30 m, which matches the resolution of the remote sensing images. In the absence of continuous observation, data observed at several times were averaged according to the characteristics of seasonal variation to make the effective factors' data more rational and objective.

B. SDI calculation and type discrimination

The value range of the SDI model is [0, 1], which follows from that of its parameters. By partitioning the SDI value range, the relationship between SDI values and sandy desertification types is established, based on the standard provided by FAO and UNEP and the real situation of the study area (Table VI).

According to this relationship, the sample points were treated with SDI calculation and sandy desertification type discrimination, and compared with the real situation. The result is shown in Table VII.
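A minimal sketch of formula (2) and the type discrimination is given below. The seven weights and the SDI cut points are placeholders standing in for the PCA-derived weights of Table V and the FAO/UNEP-based thresholds of Table VI, which are not reproduced here:

    def sdi(intensities, weights, veg_coverage):
        """Eq. (2): SDI = eta * sum(W_i * F_i); eta is 0 when vegetation
        coverage exceeds 90% (non-desertification land) and 1 otherwise."""
        eta = 0.0 if veg_coverage > 0.90 else 1.0
        return eta * sum(w * f for w, f in zip(weights, intensities))

    def discriminate(sdi_value, cuts=(0.25, 0.50, 0.75)):
        """Map an SDI value in [0, 1] to a type; the cut points here are
        illustrative placeholders for the thresholds of Table VI."""
        for cut, label in zip(cuts, ("light", "modest", "high")):
            if sdi_value <= cut:
                return label
        return "serious"

    # Hypothetical sample point: seven intensity indices F_i in [0, 1]
    # with placeholder weights W_i summing to 1.
    F = [0.8, 0.6, 0.7, 0.9, 0.4, 0.5, 0.3]
    W = [0.18, 0.15, 0.16, 0.17, 0.12, 0.12, 0.10]
    print(discriminate(sdi(F, W, veg_coverage=0.20)))   # -> "high"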


Of the forty points selected and processed, seven were discriminated wrongly. The accuracy rate is 82.5%, which meets the research requirement. The error points include both too-lenient and too-severe results, but they fall only between adjacent levels, never skipping a level. Moreover, the SDI values of those error points lie close to the boundary between two types, mainly because some factor's value is slightly too large or too small. This indicates that the relationship between SDI and desertification types needs further amendment, but the high discrimination rate already reaches the standard of this study.

C. Spatial-temporal simulation of sandy desertification evolution

Owing to the limited effective-factor data for the sample points, a linear regression model and a moving average model were used to project their values for 1990, based on the annual data sets from 1973 to 2000 and the points' data of 2000. The SDI of each point was calculated and its sandy desertification type inverted, and the interpolation method was then used to simulate the desertification distribution in the design year, with the support of the computing and image analysis functions of GIS (Figure 2a).

The statistical comparison of the simulation with the remote sensing image (Figure 1b), given in Table VIII, shows that light desertification has the biggest error, 13.90%, mainly because the selected points cannot reflect the slightly smaller regions of light desertification within farmland. The relative errors of modest and high desertification are -10.04% and -6.64% respectively, giving smaller simulated areas, whereas the serious class shows a bigger area than the remote sensing image. It can be seen from both images that some modest and high desertification is classified as serious in the northeastern region of the study area, mainly because of the limitations of point selection and the interpolation method. Overall, the relative error is -2.72%, i.e. the simulation of the total desertification area is 97.28% correct, a good simulation result.

V. SPATIAL-TEMPORAL PREDICTION OF SANDY DESERTIFICATION EVOLUTION

According to the data characteristics of the sample points in 2000, three methods (a linear regression model, a multiple regression model and a time series model) were used to predict the effective factor data of the points for 2010, combined with 35 years of observations from 1973 to 2007 (Figure 3). The spatial-temporal evolution of sandy desertification in the target year can thus be obtained, as shown in Figure 2(b).

Based on the analysis of the simulated status, the area of sandy desertification increases significantly by 2010 compared to 2000. There are large areas of light desertification within farmland, parts of which even turn into the modest or high classes, and the boundary of the farmland moves inward apparently. The areas of the various desertification types calculated for 2010 are as follows: light 20.47 km2, modest 9.12 km2, high 20.03 km2 and serious 61.70 km2, a total of 111.33 km2, which is 77.32 percent of the study area. It can be seen that 11.2 km2 of farmland is eaten up by desert and serious desertification land increases by 13.8%, mostly converted from the modest and high classes. Overall, the growth rate of desertification after 2000 is more serious than before, which shows that management of sandy desertification in the study area is urgent.
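The factor forecasting of section V can be illustrated with ordinary least squares over an annual series; a minimal sketch, assuming a hypothetical yearly vegetation-coverage series (the study itself combines linear regression, multiple regression and time-series models):

    import numpy as np

    def forecast_linear(years, values, target_year):
        """Fit value = a*year + b by least squares and extrapolate."""
        a, b = np.polyfit(years, values, deg=1)
        return a * target_year + b

    # Hypothetical observed series, 1973-2007, with a slow declining trend:
    years = np.arange(1973, 2008)
    coverage = 0.35 - 0.002 * (years - 1973)
    print(forecast_linear(years, coverage, 2010))   # extrapolated 2010 value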
VI. CONCLUSION AND DISCUSSION

Research on desertification started in 1977, when the United Nations Conference on Desertification (UNCOD) was held in Nairobi, Kenya. After that, the concept of "sandy desertification" was put forward for the first time in China. Based on existing research results, i.e. the main effective factors of desertification, the SDI model is established and combined with the classification standard of desertification types. Changes of the latter can then be expressed through changes of the former, enabling the reconstruction and prediction of the spatial-temporal evolution of sandy desertification. The results are as follows. Firstly, the SDI model established from the main effective factors corresponds well to the types of desertification, and the simulation and prediction of the spatial-temporal evolution of desertification are carried out through changes of the SDI. Secondly, significant desertification expansion will occur in the research area in the next few years, even more serious than before, if the current land development patterns and policies are maintained, so it is necessary to take remedial measures as soon as possible.

This research has certain limitations. The first is the small number of selected sample points: the area is not fully covered and the representativeness of the data is limited by personnel and equipment constraints. Secondly, the data of the effective factors could not be observed and obtained continuously, resulting in an incompletely correct rate in discriminating the desertification types and weakening the support for predicting future development. Man-machine data acquisition and wireless remote transmission in further research can make up for these deficiencies quite well.

ACKNOWLEDGEMENT

This paper is supported by the Projects of the Cultivate Foundation of the Science and Technology Innovation Program of China (No. 708090), the National Scientific Foundation of China (No. 70361001, No. 40871023), the Project of the Meteorological Desert Research Foundation of China (Sqj2007004) and the Open Project of the Key Laboratory of Oasis Ecology (Xinjiang University), Ministry of Education (XJDX0201-2008-10).

REFERENCES
[1] Mario G. Manzano, Jose Navar, Processes of Desertification by Goats Overgrazing in the Tamaulipan Thornscrub (Matorral) in North-eastern Mexico [J], Journal of Arid Environments, Vol.44, 2000, pp.1-17.
[2] J.C. Bathurst, J. Sheffield, X. Leng, G. Quaranta, Decision Support System for Desertification Mitigation in the Agri Basin, Southern Italy [J], Physics and Chemistry of the Earth, Vol.28, 2003, pp.579-587.


[3] Cheng Weiming, Zhou Chenghu, and Liu Haijiang, et al., Research of Oasis Expansion and Eco-environmental Evolution during 50a in Manas River Basin [J], Science in China, Ser. D Earth Science, Vol.35, No.11, 2005, pp.1074-1086.
[4] Kjeld Rasmussen, Bjarne Fog, and Jens E. Madsen, Desertification in Reverse? Observations from Northern Burkina Faso [J], Global Environmental Change, Vol.11, 2001, pp.271-282.
[5] Zhao Wenzhi, Chang Xueli, and He Zhibin, et al., Ejina Oasis Vegetation Ecological Water Requirement Study [J], Science in China, Ser. D Earth Science, Vol.36, No.6, 2006, pp.559-566.
[6] Wang Ninglian, Yao Shandong, and Yang Xiangdong, et al., Trends of Sandstorms' Frequency in the 20th Century Reflected by Ice Core and Lake Sediment Records in Northern China [J], Science in China, Ser. D Earth Science, Vol.37, No.3, 2007, pp.378-385.
[7] Lei Jiangqiang, Mu Guijin, and Wang Lixin, A Brief Introduction on the Progress of Systematic Studies on Main Biological Events in the Eukaryotes [J], Science Foundation in China, 2005, pp.268-276.
[8] Li Xiangyu, Li Shuai, and He Qing, An Overview of Study on Sandy Desertification [J], Arid Meteorology, Vol.23, No.4, 2005, pp.73-82.
[9] M. Nael, H. Khademi, and M.A. Hajabbasi, Response of Soil Quality Indicators and Their Spatial Variability to Land Degradation in Central Iran [J], Applied Soil Ecology, Vol.27, 2004, pp.221-232.
[10] S.M. Herrmann, C.F. Hutchinson, The Changing Contexts of the Desertification Debate [J], Journal of Arid Environments, Vol.63, 2005, pp.538-555.
[11] Wang Xunming, Li Jiejun, and Dong Guangrong, et al., Response of Climate Evolvement and Sandy Desertification in Sandy Areas near 50a, Northern China [J], Science Bulletin, Vol.52, No.24, Dec. 2007, pp.2882-2888.
[12] Zhu Zhimei, Yang Chi, and Cao Mingming, et al., Analysis on the Soil Factor and Physiological Response of the Plants in the Process of Sandy Desertification on Grassland [J], Acta Ecologica Sinica, Vol.27, No.1, Jan. 2007, pp.48-57.
[13] Wang Ranghui, Fan Zili, Study on Land Desertification with RS and GIS Techniques in Alagan, the Lower Reaches of Tarim River [J], Journal of Remote Sensing, Vol.2, No.2, May 1998, pp.137-142.
[14] Han Guihong, Tuerxun Hasimu, and Shi Li, Discussion on Land Desertification and Causes in Lower Reaches of Tarim River [J], Journal of Desert Research, Vol.28, No.2, Mar. 2008, pp.217-222.
[15] Yi Deting, Study on the Desertification in Xinjiang and Its Causes of Population and Economy [D], Xinjiang: Xinjiang University, 2003.
[16] Wang Tao, Wu Wei, and Xue Xian, et al., Time-space Evolution of Desertification Land in Northern China [J], Journal of Desert Research, Vol.23, No.3, May 2003, pp.230-235.
[17] Wang Tao, Wu Wei, and Xue Xian, et al., Spatial-temporal Changes of Sandy Desertified Land during Last 5 Decades in Northern China [J], Acta Geographica Sinica, Vol.59, No.2, Mar. 2004, pp.203-212.
[18] Zhu Zhenda, Chen Guangting, Sandy Desertification Land in China [M], Beijing: Science Press, 1994, pp.132-133.
[19] Chen Yalin, Chang Xueli, and Cui Buli, et al., Dynamics Analysis on Development of Desertification in Hobq Desert [J], Journal of Desert Research, Vol.28, No.1, Jan. 2008, pp.27-34.
[20] Li Sen, Yang Ping, and Wang Yue, et al., Preliminary Analysis on Development and Driving Factors of Sandy Desertification on Ali Plateau [J], Journal of Desert Research, Vol.25, No.6, Nov. 2005, pp.838-844.
[21] Dong Yuxiang, Liu Yihua, Study on Assessment Criteria System for Hazard Degree of Desertification Disaster [J], Journal of Catastrophology, Vol.9, No.1, Mar. 1994, pp.8-12.
[22] Dong Yuxiang, Study on the Assessment Model for Hazard Degree of Sandy Desertification [J], Scientia Geographica Sinica, Vol.15, No.1, Feb. 1995, pp.24-29.
[23] FAO and UNEP, Provisional Methodology for Assessment and Mapping of Desertification [M], 1984.
[24] Zhu Zhenda, Judgment of Concept and Development for Sandy Desertification [J], Journal of Desert Research, Vol.3, No.4, 1984, pp.2-8.
[25] Yang Shiqi, Gao Wangsheng, and Sui Peng, et al., Quantitative Research on Factors of Soil Desertification in Gonghe Basin [J], Acta Ecologica Sinica, Vol.25, No.12, Dec. 2005, pp.3181-3187.

TABLE I. REMOTE SENSING IMAGE CHARACTERS OF SANDY DESERTIFICATION LANDS

Type      Image characters                      Other characters
Light     Massive, irregular, light red         Red dots among the light red regions
Modest    Lumpy, irregular, pink                Uneven ground with sand distribution
High      Irregular patches, brown-yellow       Clear sand dunes with brushwood dots
Serious   Broad distribution, brown-yellow      Apparent landforms of sand dunes and ridges

TABLE II. STATISTICS OF DIFFERENT TYPES OF SANDY DESERTIFICATION (AREA IN KM2, PERCENT IN %)

Year    Light (area/%)    Modest (area/%)    High (area/%)    Serious (area/%)    Total (area/%)
1973    10.01 / 6.95      31.50 / 21.88      30.69 / 21.31    19.39 / 13.46       91.59 / 63.60
1990    9.21 / 6.39       17.14 / 11.91      28.93 / 20.09    39.58 / 27.49       94.86 / 65.87
2000    11.59 / 8.05      14.29 / 9.92       32.36 / 22.47    41.86 / 29.07       100.10 / 69.52


TABLE III. SPREADING RATES (SHIFT DEGREE, %) OF SANDY DESERTIFICATION LANDS DURING DIFFERENT PERIODS IN THE RESEARCH AREA

Period       Light    Modest    High     Serious    Total
1973-1990    -0.47    -2.68     -0.34    6.13       2.64
1990-2000    2.59     -1.66     1.19     0.58       2.69

TABLE IV. CLASSIFICATION STANDARDS OF MAIN EFFECTIVE FACTORS OF SANDY DESERTIFICATION

Effective factor            Light       Modest       High         Serious
Annual wind speed (m/s)     …           …            …            >5.9
Annual temperature (℃)      …           …            …            >15.0
Annual precipitation (mm)   >80.0       80.0~55.0    55.0~15.0    <15.0
Vegetation coverage (%)     >50.0       50.0~30.0    30.0~10.0    <10.0


Figure 1. Spatial-temporal change of sandy desertification lands during the past nearly 30 years in the research area: (a) 1973, (b) 1990, (c) 2000.

Figure 2. Simulation and forecast of sandy desertification: (a) simulation of sandy desertification in 1990, (b) forecast of sandy desertification in 2010.

Figure 3. Effective factors data of the research area from 1973 to 2007.


Section 2
COMPUTER NETWORKS 1


Effect of Hard RTOS on DPDC SCADA System Performance

A. M. Azad, C. M. Hussain and M. Alam
Electronics and Communication Engineering
BRAC University
Dhaka 1212
BANGLADESH
a.azad@bracuniversity.ac.bd

Abstract - Supervisory Control And Data Acquisition (SCADA) systems are extensively used in power systems, specifically for monitoring different power parameters and for operating and controlling power electronics as well as other high-voltage elements, for example breaker tripping or continuous data procurement from current, voltage or power transformers. A SCADA system failure can have severe consequences, including equipment damage, customer load losses and even loss of life. Dhaka Power Distribution Company Ltd. (DPDC), formerly DESA, has been using a SCADA system developed by ABB for over a decade. Since its introduction at DPDC, the SCADA system has hardly had any performance upgrade. At present the entire SCADA system exhibits many problems that are rendering the whole structure obsolete. At an early stage ABB came across some limitations which were later solved in a way that may not meet the timing precision that present technological development demands. ABB used the soft real-time operating system UNIX. This OS usually responds with high latency, which has sometimes caused remote power elements (breakers) to fail to operate within a given time frame. The communication structure is based on microwave links, which also have some shortcomings in terms of SCADA security. Evaluating these consequences, this paper presents some drawbacks of the current DPDC SCADA system and proposes hard RT Linux as its operating system.

Key words: Hard real-time, Soft real-time, PID tuning, SCADA, RT-Linux, RTU.

1 INTRODUCTION

SCADA refers to a system that accumulates data from different sensors at feeders, relays or other remote locations and then forwards this data to a central computer, which controls those elements according to predefined instructions based on the procured data. The system analysis in this paper concentrates on the former DESA SCADA system, which has currently been reformed as the Dhaka Power Distribution Company Ltd (DPDC) SCADA system. The control capability that SCADA systems provide is essential for the safe and efficient operation of our electric power grids. The system-wide monitoring and control functions provided by such systems may contain slight shortcomings which make the entire system less effective where strict time accuracy is a concern. SCADA systems are used in power systems to monitor, operate and control generation, transformer, switching and load stations [1,2,3,5,18]. Such control can be automatic or manually initiated by operator commands. A typical SCADA system consists of three main components, namely the remote terminal unit (RTU), the master control and the telecommunication network. The failure of any of these components may disable the entire system. The master control takes part in controlling the entire network; it is expected to be compact, to use advanced equipment and at the same time to be highly time-sensitive in order to send instructions to its targets efficiently.

The operating system used by the SCADA system is UNIX (DEC OSF/1), which is a soft RT operating system; it can be replaced by a hard RTOS, for example RT-Linux, to ensure time-criticality in every possible measure. This paper will demonstrate how the maximum delay time can be reduced and the loop iteration time confined to the desired bound by integrating an RTOS as the DPDC SCADA operating system.
In section 2 the formation of the DPDC SCADA control system communication is described, followed by section 3, where the method of SCADA protection is explained in brief. In section 4 the structure of the SCADA hardware is given, and in section 5 the SCADA software is described, with an experimental presentation of breaker operation in subsection 5.1. In section 6 the hard real-time operating system is explained, with subsection 6.1 about real-time Linux, and in section 7 related work on RT-Linux at BRAC University is explained in detail, including results from laboratory tests, in a few subsections. The proposal of a hard RTOS for DPDC SCADA is provided in section 8.

2 SCADA, DHAKA POWER DISTRIBUTION COMPANY LTD

Electric network management is a unique and proven system which, after being introduced by the former Dhaka Electric Supply Authority (DESA) back in 1995, has saved some Tk 2-3 crore every month [8]. Bangladesh can save crores of taka in the power sector by adopting a new system which will also improve the quality and security of power distribution. The SCADA system gives an overview of the network and an up-to-date view of voltage levels and equipment states, making operators instantly aware of anything happening over a wide area, like a spider at the centre of a web. Such a control centre gives the operator a window into the electrical network through a computer station. In the greater Dhaka power communication network of SCADA, the Remote Terminal Units (RTU) are located at quite a distance from the master control unit, and communication is established by microwave wireless links.


One major wireless communication ring was developed from five cells connecting to one SCADA master control (Dhanmondi CC). The five cells, New Mirpur CL-2 (2/2), Tongi2 CL-5 (2/2), Narsingdi CL-3 (2/3), Bulta CL-3 (3/3) and Fatulla CL-7 (2/2), are involved in relaying signals to the different sub-cells or RTUs. The sub-cells receive signals to operate the power elements. These sub-cells are not in the communication ring and can only send acknowledgments to the master control through the relays or transponders. The RTUs adjacent to the base stations are connected by Ultra High Frequency (UHF) communication and in some cases by pilot cable.

3 SCADA PROTECTIVE SCHEME

Power systems are operating close to their design limits. These operating conditions leave little room for error in the protection and control systems. The pre-eminent approach to preserving the transient stability of the power grid is high-speed fault-clearing protection and control systems. These high-speed tripping schemes are called pilot protection because they utilise end-to-end communications to provide high-speed, simultaneous fault clearing or termination. Remote terminal systems must receive a signal from the master control to issue a local tripping signal. In the case of the DPDC (former DESA) SCADA, microwave communication is the primary concern, although applicable communication channels include audio tone, microwave, fibre optic and spread-spectrum radio. These schemes exchange a minimum of one information bit. Unlike blocking schemes, precise signal or message timing is usually not critical. The more time it takes for a relay at one end to receive notification that the remote terminal relay also senses a forward fault, the longer the tripping time. A slightly delayed trip during an internal fault does not constitute a misoperation. It is accepted that a failure to receive a valid tripping signal can result in a failure to operate at high speed for an internal fault [9]. At the same time, it should be taken into account that a sharp time response can improve the efficiency of the system.

4 SCADA HARDWARE STRUCTURE

The ABB SCADA system consists of a series of servers, consoles and networking components that build a hardware platform on which the ABB Energy Management Software suite is installed. A central processing server provides the central core for the SCADA system and includes database management, centralised communications and other critical SCADA functions. The central processing server consists of a Compaq Alpha server running Tru64 release 5.1b. Disk storage is provided by six disk drives. A backup for these drives is provided in a split SCSI bus cage with 12 disk drives. Each set of six disk drives can be used as the primary drive system during boot. This allows a fully configured and functional backup copy of the central processing server to be available should testing crash the primary system.

The real-time database and communications server consists of a Compaq Alpha server running Tru64 release 5.1b. Disk storage is provided by two disk drives acting as a primary and a secondary drive. The primary drive is mirrored to the secondary drive via a manually run script.

The consoles provide the Human Machine Interface (HMI) for the ABB SCADA/EMS system. In a typical system there are many consoles, each providing control, analysis and/or monitoring functions for the ABB system. All PCs in this system are HP workstations with Xeon processors running Windows XP Professional. Disk storage is provided by a single disk drive.
The NVIDIA Quadro NVS graphics system is capable of driving up to four computer displays.

5 SCADA SOFTWARE STRUCTURE

Digital Equipment Corporation (DEC) developed a new UNIX implementation based on the OSF/1 specification. Digital UNIX was formerly called DEC OSF/1 and was initially marketed under that name [14]; DEC renamed the new operating system Digital UNIX, and then Tru64 UNIX when Compaq acquired the company. It is a 64-bit operating system for workstations and servers equipped with the Alpha processor [4]. Though it is fully System V-compliant, for both the user and the administrator it behaves more like a BSD system. Servers 6 and 7, Digital AlphaStation 400 4/233 machines, are dedicated as the application servers of the DPDC SCADA system, compatible with OSF/1 Rev 3.2 and comprising various application software (e.g. DEC OSF-BASE, OSF-USR). For the database architecture, the Data Engineering Tool is installed on the dhaka1 server, comprising Oracle 7.1.3.2.1 for DEC OSF/1 3.0, RDBMS 7.1.3.2.1, etc. In terms of Human Machine Interface (HMI), servers (WS400 workstations) are allocated as dhaka2, dhaka3, dhaka4 and dhaka5 (remote workstation, LDC), where different HMI software is installed [20].

SCADA may need to operate more than one RTU of different substations at the same time. To preserve their temporal behaviour, command applications require that the underlying systems provide soft real-time Quality of Service (QoS) guarantees. In the current DPDC SCADA operating system (UNIX DEC OSF/1), a multi-user, multi-process, time-sharing (TS) environment, these applications do not perform well when they are scheduled concurrently with traditional non-RT applications such as text editors, compilers or computation-intensive tasks. Real-time (RT) applications also do not perform well when other RT applications are scheduled concurrently. Untimely scheduling of processes, rather than insufficient CPU capacity, is partially responsible for this kind of adverse phenomenon. One potential solution for serving an RT application in a UNIX environment is to dedicate the entire system to that single RT application. This involves blocking services to all other RT or non-RT applications and users. It avoids the potential scheduling problem, but it also defeats the UNIX environment's goals of supporting multi-user, multi-process and time-sharing properties [17].


Therefore, this solution is not feasible in the UNIX environment.

5.1 Breaker Operation of DPDC SCADA

At the DPDC SCADA installation in Kataban, we conducted an experiment on the operation of breakers located at quite a distance from the master control. Remote Terminal Units (RTU) are designed to respond to commands given from the master control, provided the breakers are in automatic mode. If the breakers are set to manual, the master control of the SCADA system has no hold on any of these RTUs or breakers in any circumstance. Since SCADA is a very sophisticated decentralised electrical control system, we were not allowed to operate the breakers extensively. We collected a small set of breaker operational data from operations that had been done on schedule. The master control operated two breakers in Jhigatala and other breakers from the Kollyanpur substation. In each case the breakers were tripped at different times. The communication delay can be considered identical, as we operated breakers located in two different substations not more than 30 km from Dhanmondi CC; any deviation in the time taken by the communication system can be neglected because microwave communication is used.

The SCADA system operates several breakers each day, centrally controlled by the Dhanmondi master control, which keeps all records of the tripping data in an event list. We procured some of these data, given in Table 5.1; all of the tripping timings listed are from the Kollyanpur and Jhigatala substations.

TABLE 5.1: BREAKER OPERATIONAL DELAY ANALYSIS

At the Kollyanpur substation, for example, a breaker was commanded to trip at 21:15:26 and responded after 1000 ms, giving a tripping time of 21:15:27 in the event list. The desired loop time, however, is 2000 ms, as we found in the following day's event list: the master control sent a command to the same substation to trip a breaker at 14:17:54, which was executed 2000 ms later, at 14:17:56. We found the same behaviour in the majority of cases, with two exceptions. We identified two large delays in breaker operational timing. The master control operated a breaker in Jhigatala at 11:49:59 and the breaker tripped at 11:50:15, an operational delay of 16 s (16000 ms). In the second case, a dummy breaker of the Jhigatala substation was operated at 15:00:43 and tripped at 15:00:50, an operational delay of 7 s (7000 ms). A graphical presentation of the fluctuation of the breaker operational delay is given in Graph 5.1.

Graph 5.1: Breaker Operational Loop Iteration of DPDC SCADA

Figure 5.1 shows the block diagram of the SCADA system, where the SCADA server end consists of the Compaq Alpha server and the UNIX operating system. The Remote Terminal Unit (RTU) end contains data acquisition and interfacing units such as marshalling and data conversion blocks. The next block is the communication, where all modulation and demodulation take place before transmitting and receiving signals respectively. After any command signal is transmitted, an acknowledgment is reflected back from the receiving substation. In this experiment the time taken by the receiving end is not the prime concern; this research rather concentrates on the time taken by the SCADA master control and on restricting the breaker tripping loop iteration to within the desired time.
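The delays above follow directly from the event-list timestamps; a minimal sketch of the calculation, with the timestamps taken from the events quoted above:

    from datetime import datetime

    def breaker_delay_ms(command_hms, trip_hms):
        """Operational delay between command and trip times (HH:MM:SS)."""
        fmt = "%H:%M:%S"
        t0 = datetime.strptime(command_hms, fmt)
        t1 = datetime.strptime(trip_hms, fmt)
        return (t1 - t0).total_seconds() * 1000.0

    print(breaker_delay_ms("21:15:26", "21:15:27"))   # 1000.0 ms, Kollyanpur
    print(breaker_delay_ms("11:49:59", "11:50:15"))   # 16000.0 ms, Jhigatala
    print(breaker_delay_ms("15:00:43", "15:00:50"))   # 7000.0 ms, dummy breaker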
The block diagram indicates the delays introduced by the different blocks: the SCADA server delay τ_Server, the transmitting and receiving data acquisition card delays τ_DAC, the communication delay at both the transmitting and receiving ends τ_Com, and, at the receiving end, the circuit breaker delay τ_CB. Since the communication system uses microwave links, its delay is negligible. The data acquisition card at the RTU takes a considerable amount of time, which could be improved by increasing the pulse frequency of the digital-to-analog or analog-to-digital converter.

Analyzing the nature of the time delay, the entire tripping loop iteration, in other words the minimum desired loop time, includes 2 × (τ_Server + τ_DAC + τ_Com). These six delay contributions cumulatively give a total delay of approximately 2000 ms.


Fig. 5.1: Block Diagram of DPDC SCADA Communication between Server & Client

Most of this delay, from our findings about 1500 ms, occurs at the SCADA server end. Fluctuations in breaker operational time occur because of multitasking and because other software is running for different applications. From the recorded data (Graph 5.1), the minimum loop iteration was measured as 1000 ms a few times; the desired loop iteration, however, is counted as 2000 ms, and in a few cases this was exceeded drastically. Events at Jhigatala substation on 5th April and 9th April 2009 demonstrate the maximum delay observed in a breaker operation: on 5th April, in one incident, the breaker tripping loop iteration was 16000 ms, corresponding to a jitter of 14000 ms.

The variation of the operational time delay depends on many parameters, and one of the most frequent causes is the operating system: a soft RT operating system does not work well when other non-RT tasks run simultaneously. Apparently the DPDC SCADA operating system, in this case DEC OSF/1, is the influential cause of the instability, as a soft RTOS is vulnerable to interruption while multitasking. SCADA operates the entire DPDC control network through five console computers on the UNIX platform. Each of them may be engaged in different tasks at the same time, for example printing data, issuing breaker commands and updating RTU status. This multitasking can cause the operational period to fluctuate significantly on each command. The jitter of 14000 ms could have been prevented by a time-critical operating system such as Real-Time Linux: RT-Linux can reduce the maximum time delay and keep the desired loop iteration within 2000 ms.

6 HARD REAL-TIME OS

A hard real-time system is an information system whose correctness depends on the moment in time at which the logical output occurs, not only on the logical output of the algorithm. A hard real-time application fails if the timing requirements placed on its operating system are not satisfied [7, 12]. The output is required to arrive within a precise time interval. A real-time system is not necessarily fast, but it must be accurate in time as well as produce the correct output. The design of a real-time system goes through multiple stages. The first stage is the identification of the tasks to be performed and of the temporal constraints they must satisfy. In the next stage the code is developed and the run time of each task is measured; a timing-constraint test is then executed to ensure that the tasks will not miss their deadlines while the system is running. Soft real-time applications, by contrast, tolerate large latencies in what they have requested from the operating system. The real-time system considered here is implemented with a combination of Linux, RT-Linux, data acquisition cards, source code and a standard PC.

7 RELATED EXPERIMENTS ON REAL-TIME ENVIRONMENT

Evidently, in the case of multitasking, a soft real-time operating system works less efficiently, as traditional non-RT tasks can be interrupted. The response of a system can, however, be reduced and controlled by running the tasks under a hard RT operating system such as RT-Linux. An experiment was performed by the Control Application Research Group (CARG) of BRAC University, implementing a PID controller algorithm in the RT-Linux environment to compare the PID controller step response in different environments (soft and hard real-time) and to improve performance over the Windows environment, as illustrated in [7].
The SCADA experiment was replicated with a PID controller for two unavoidable reasons: the SCADA control equipment was far beyond our reach, and we wanted to compare the performance of two available operating systems, Windows XP and RT-Linux (Knoppix 3.0), since UNIX DEC OSF/1 was unavailable. The real-time system is the more accurate system, giving the lowest latency in processing the proportional, integral and derivative parameters of the controller, and through this real-time environment the servomotor control unit is controlled. This experiment shows that the delay of the PID algorithm can be significantly reduced under a real-time implementation. Reducing the delay and strengthening the controller action in an RT-OS leaves the system with minimized measurement noise. In order to obtain the best response, a sampling rate of 10 ms was chosen for the soft real-time case and 1 ms for the hard real-time case. The time responses were examined in both hard real-time and soft real-time environments. The results, demonstrated in a subsequent part of this paper, show that the hard real-time implementation performed faster than the soft real-time one due to the lower latency of RT-Linux. Multitasking is a very important factor in our computing world, but it is more effective when it is uninterruptible; accuracy is another unavoidable factor [19]. In a different experiment, a closed-loop control system for a stepper motor was implemented using RT-Linux under multitasking. As a result, the total system can run parallel tasks simultaneously while driving the stepper motor error-free.


A large amount of work and many applications have already been devoted to error-free operation of stepper motors. Some are based on microcontrollers, some use Linux-based programs, and others use modified driver circuits. The speciality of this particular work is to drive the stepper motor with maximum accuracy under uninterrupted multitasking, with the task processed in an RT kernel. The main components of this work are the DAQ card used for multitasking, an optical encoder providing feedback from the stepper motor, and a program that runs the entire system and, if any error occurs, takes the necessary action immediately. In this way the accuracy in a multitasking environment is maximized. The results in subsection 7.2.1 show that RT-Linux performs more consistently under multitasking, with a comparatively shorter response time.

7.1 Hard Real-Time PID Controller

In this work the PID controller implementation under soft real-time is presented with the VCL program on the MS15 DC motor control module (servo controller). The system comprised a 120 MHz Pentium laboratory PC with Windows XP installed, connected to a servo system through a parallel port. The servo system module enables the user to perform closed-loop positional or speed control of a DC motor. Speed of rotation and positional feedback information are available in both analog and digital forms, but in this particular experiment the module was controlled through the analog interface [16].

The PID controller under the RT-Linux environment comprised the same 120 MHz Pentium laboratory PC that had been used for the soft RTOS, together with the feedback modules and an AX5411 data acquisition card. RT-Linux 3.0 was loaded to execute the real-time thread of the PID controller [7]. A real-time thread program was developed based on the PID algorithm according to the requirements of the software [7, 13]. A priority-scheduling real-time kernel was implemented for the data acquisition card with dissimilar sampling and hold rates. During the initialization step, the tasks are assigned to each module. The flow of data to and from the data acquisition card is implemented with real-time tasks through real-time FIFOs. Interrupt-based techniques were applied to handle three real-time tasks: A/D conversion, real-time thread program execution, and D/A conversion. Inter-task communication among the three real-time tasks is done through shared memory. The RT kernel receives a fixed set of tasks at initialization, and each task is assigned a priority level in the pre-emptive scheme. The real-time tasks communicate with non-real-time Linux and with the data acquisition card, and the RT-FIFOs avoid message losses. A high-level interface between the user and the experiment is provided by a non-real-time GUI program using GTK (the GIMP Tool Kit) in the Linux environment. The non-time-critical set of tasks comprises data logging, display and the GUI. The real-time tasks deliver results to Linux at a low rate, with the final data passed through shared memory and a circular buffer under the data acquisition scheme.

7.1.2 PID Controller Experiment Result

At the beginning of the experiment, the soft real-time PID controller was considered [7]. The servo system was operated using Ziegler-Nichols parameters, and the system response was observed in terms of delay time, rise time, peak time, settling time and overshoot. The same experiment was then performed using the RT-Linux real-time thread program.
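The paper's RT-Linux implementation is a kernel-side C thread communicating through RT-FIFOs; purely as a language-agnostic illustration of the same control structure, the sketch below shows a fixed-rate PID loop pinned to a fixed-priority scheduler. The read_position and write_drive functions stand in for the AX5411 A/D and D/A paths and are hypothetical; os.sched_setscheduler with SCHED_FIFO requires Linux and root privileges, and a user-space loop like this is only an approximation of a true RT-Linux thread.

```python
import os, time

KP, KI, KD = 2.0, 0.5, 0.1   # example Ziegler-Nichols-style gains (assumed)
PERIOD = 0.001               # 1 ms sampling, as in the hard real-time case

def read_position():         # hypothetical stand-in for the A/D channel
    return 0.0

def write_drive(u):          # hypothetical stand-in for the D/A channel
    pass

def pid_loop(setpoint, cycles):
    # Request a fixed real-time priority (POSIX SCHED_FIFO), analogous to
    # the fixed-priority real-time thread described above. Needs root.
    os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(50))
    integral, prev_err = 0.0, 0.0
    next_deadline = time.monotonic()
    for _ in range(cycles):
        err = setpoint - read_position()
        integral += err * PERIOD
        derivative = (err - prev_err) / PERIOD
        write_drive(KP * err + KI * integral + KD * derivative)
        prev_err = err
        next_deadline += PERIOD
        # Sleep until the next period; a missed deadline shows up as a
        # negative remaining time (the "jitter" discussed in this paper).
        remaining = next_deadline - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)

# pid_loop(setpoint=1.0, cycles=1000)
```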
The comparative results in the soft real-time and hard real-time environments are as follows:

TABLE 7.3: T_D, T_R, T_P, T_S PARAMETERS [7]

Parameter          | Soft real-time PID | Hard real-time PID | Faster rate
Delay time, T_D    | 0.10 s             | 0.064 s            | 1.56 times faster
Rise time, T_R     | 0.19 s             | 0.103 s            | 1.84 times faster
Peak time, T_P     | 0.31 s             | 0.146 s            | 2.12 times faster
Settling time, T_S | 0.70 s             | 0.35 s             | 2 times faster
% Overshoot, M_p   | 25%                | 25%                | Same

The delay time is the time required for the response to reach half the final value. From the table above, the delay time is T_D = 0.10 s for the soft real-time PID controller; in hard real-time it is reduced to 0.064 s, which is 1.56 times faster. Responses of the servo system with the PID controller in the soft and hard real-time environments, with the necessary parameters, are illustrated in [7]. T_R is the time required for the response to rise from 0% to 100% of its final value. For soft real-time the rise time is 0.19 s and for hard real-time 0.103 s, so the response is 1.84 times faster with the hard real-time PID controller. The peak time T_P (the time required for the response to reach the first peak of the overshoot) is 0.31 s for soft real-time, whereas for hard real-time it is 0.146 s, a 2.12-times improvement. The settling time T_S is likewise 2 times faster in hard real-time; the settling zone is taken as ±2%. The percentage peak overshoot is 25% in both cases (hard and soft real-time PID controller). Overall, this clearly shows an improvement in system response for the hard real-time PID controller over the soft real-time PID controller.

7.2 RT-LINUX Stepper Controller on Multitasking

To obtain the maximum performance from a stepper motor, different types of improvements have been pursued. In this work, a control system for a stepper motor using Real-Time Linux was implemented [19]. An RT-Linux program algorithm was used in which the motor is controlled in the real-time kernel, so that no other task can interrupt this task under multitasking. A data acquisition card (PCL-812PG) was connected to send signals to and receive signals from the hardware interface. A GUI controller was also developed in this work to make the system easier to use. The Linux-based program interacts with the hardware interface to drive the stepper motor in a real-time environment under multitasking, so the driver circuit drives the motor using commands from the RT program via the DAQ card. Stepper motors are widely used where accurate and efficient movement is required. However, mis-stepping is the main problem, and many kinds of work have been done to find a remedy, including the development of many algorithms and programs. Under multitasking, though, these are not very efficient, and a non-RT task can be interrupted by other tasks in a soft or non-RT OS.


The system can be more efficient if RT-Linux takes over in the multitasking environment. In this work an RT-Linux PC controlled the entire system in real time. A PCL-812PG DAQ card is used to send and receive data from the hardware interface, and multiple tasks can be controlled through this device. The PCL-812PG is a high-performance, high-speed, multi-function data acquisition card for IBM PC and compatible computers. It is suited to a wide range of applications in industrial and laboratory environments, including data acquisition, process control, automatic testing and factory automation. The main purpose of the card here is to receive signals from and send signals to the external hardware interface; it is mainly used for multitasking in the real-time environment. A ULN2003 driver circuit receives data from the DAQ card and drives the unipolar stepper motor accordingly. The unipolar stepper motor is a 5-wire motor with a step size of 1.8 degrees; it rotates clockwise or anticlockwise and essentially converts electrical pulses into mechanical motion.

7.2.1 Stepper Motor Experiment Results

An experiment was performed in the BRAC University laboratory to compare the control of the stepper motor in different environments (RT-Linux, non-RT Linux and Windows). A DSO was used to measure each stator signal from the stepper motor under multitasking. During the measurements, four programs were running simultaneously in each case, keeping the CPU close to 90% occupied for RT-Linux and 70% for non-RT Linux and Windows. The procured results were then compared for time resolutions of 10 ms, 1 s and 2 s. Table 7.2.1 shows that, for a 1 s time resolution and a 90-degree full rotation in multitasking mode, Windows takes 7.92 s, non-real-time Linux takes 6.16 s and hard RT-Linux takes 6.00 s. The experiment thus shows that RT-Linux is better than non-RT Linux and performs much better than the Windows OS. Tightening the timing resolution to 10 ms causes irregular movement of the stepper motor under non-RT Linux and Windows, whereas RT-Linux works properly. After increasing the time duration to 2 s, some missed steps occur under both soft RT Linux and Windows, while RT-Linux continues to work accurately.

TABLE 7.2.1: STEPPER MOTOR FOR 90 DEGREE ROTATION

Parameter                    | Windows                                  | Soft RT Linux      | Hard RT Linux
Time difference (10 ms period) | Random movement with severe interruption | Irregular waveform | 59.2 ms
Time difference (1 s period)   | 7.92 s                                   | 6.16 s             | 6.00 s
Time difference (2 s period)   | Missed steps                             | Missed steps       | Accurate

This control system works for 1 ms < T < 3 s, because with a 1 ms period irregular movement of the motor occurs, and any time duration greater than 3 s also gives some irregular movement.
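For readers unfamiliar with unipolar drives, the sketch below shows the kind of four-phase excitation sequence a ULN2003-based driver steps through, and how a fixed pulse period maps to rotation time for the 1.8-degree motor described above (90 degrees = 50 full steps). The write_phases function standing in for the PCL-812PG digital output port is hypothetical.

```python
import time

FULL_STEP_SEQ = [0b0001, 0b0010, 0b0100, 0b1000]  # one phase energized at a time
STEP_ANGLE = 1.8  # degrees per full step for the motor described above

def write_phases(bits):
    """Hypothetical stand-in for the DAQ card's digital output port."""
    pass

def rotate(degrees, period_s):
    """Rotate by `degrees` using full steps with a fixed pulse period."""
    steps = int(round(degrees / STEP_ANGLE))  # 90 deg -> 50 steps
    for i in range(steps):
        write_phases(FULL_STEP_SEQ[i % 4])
        time.sleep(period_s)  # the per-step delay is what varied in the tests
    return steps * period_s   # nominal rotation time

# With a 120 ms period, 50 steps take a nominal 6.0 s, the same order as the
# 6.00 s measured for hard RT-Linux in Table 7.2.1; the Windows and soft-RT
# figures exceed this nominal time because of scheduling interruptions.
print(rotate(90, 0.12))
```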
For closed-loop control of the stepper under multitasking, the RT-Linux-based algorithm is used both to control the stepper motor and to provide error correction. This experiment showed the response of the system under multitasking in both real-time and non-real-time environments. In terms of controlling a stepper motor, the RT-Linux-based control algorithm delivered real-time response and an error-correction feature, compared with any soft or non-real-time operating system.

Fig. 7.2.1: Wave form of Windows under Multitasking (1 sec delay/pulse)
Fig. 7.2.2: Wave form of soft RT under Multitasking (1 sec delay/pulse)
Fig. 7.2.3: Wave form of Hard RT under Multitasking (1 sec delay/pulse)

The waveforms for Windows, soft RT Linux and hard RT Linux are given in the three consecutive figures above; further sets of responses at 10 ms and 2 s show the nature of the waveforms in those cases.


8 PROPOSED REAL-TIME OPERATING SYSTEM FOR DPDC SCADA SYSTEM

As mentioned before, real-time applications also do not perform well when other RT applications are scheduled concurrently. One feasible way to improve the current OS is the existing RT extension to UNIX. The UNIX POSIX.4 real-time extension provides fixed priorities to real-time applications. The priority scheduling rule dictates that higher-priority processes are scheduled before lower-priority ones in a pre-emptive fashion. RT processes are assigned higher fixed priorities, whereas non-RT processes are assigned lower dynamic priorities. As a result, RT processes are served before non-RT processes, and higher-priority RT processes are served before lower-priority RT processes. This fixed-priority mechanism provides a convenient way to implement the rate monotonic (RM) algorithm, because the ordering of priorities between the RT processes follows the ordering of the process rates, i.e. the lengths of their periods. Under an RM schedule, RT processes with smaller periods are executed first, followed by RT processes with larger periods and then non-RT processes [17]. Before any RT process starts, a schedulability test is performed by checking its total CPU demand, so that admitting the new process will not push the CPU allocation beyond the CPU capacity. This schedule, however, has several problems, described in the following paragraph.

Priority should represent the importance of a process rather than whether the process is RT or non-RT. For example, one user's print command running as an RT application would be treated as more important than another user's breaker-trip command running as a non-RT application; RT and non-RT applications must share the CPU time fairly. It is also unreasonable to assign priorities to RT applications based only on the lengths of their periods. This is called the fairness problem. There is, moreover, no mechanism to enforce deadlines, prevent overruns, or stop a faulty high-priority RT process from monopolising the CPU, because the scheme provides no protection between applications. Frequent overruns by a high-priority process can cause massive interruption to lower-priority processes, and an RT process at very high priority can even block most system processes and lock up the entire system. This is called the enforcement problem. Root privilege is required to run an application under fixed priority; given UNIX security concerns, it is impossible to give every user root privilege to run RT applications. This is called the security problem.

UNIX is an operating system created in the early days of computing. More recently, Linux was created as an open-source, free operating system. Hard RT-Linux can make a system guarantee predefined response times to certain hardware events by activating kernel pre-emption. RT-Linux can use UNIX constructs but also departs from traditional UNIX. RT-Linux is faster than many commercially available operating systems, and it appears to be far more robust than UNIX DEC OSF/1. Linux is used in many time-critical applications because of its speed [10]. SCADA can be made more efficient with RT-Linux without any major modification of its setup, because RT-Linux is used in many applications that must maintain uptime: Linux, like UNIX, can run for months at a time without rebooting.

The reason SCADA experts over a decade earlier set the response time at 30 ms is that they found this to be the minimum time needed to operate any power element consistently.
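To make the schedulability test mentioned above concrete, the sketch below applies the classic rate monotonic utilisation bound, U ≤ n(2^(1/n) − 1), to a hypothetical task set. The task periods and execution times are illustrative only, not measurements from the DPDC system.

```python
# Rate monotonic (RM) admission test: a set of n periodic tasks is
# schedulable under fixed RM priorities if the total utilisation does not
# exceed n * (2**(1/n) - 1) (the Liu & Layland bound).

def rm_schedulable(tasks):
    """tasks: list of (execution_time, period) pairs in the same time unit."""
    n = len(tasks)
    utilisation = sum(c / t for c, t in tasks)
    bound = n * (2 ** (1.0 / n) - 1)
    return utilisation, bound, utilisation <= bound

# Hypothetical task set: a 2000 ms breaker-command loop plus two
# housekeeping tasks (all values are illustrative assumptions).
tasks = [(300, 2000), (100, 1000), (50, 500)]
u, bound, ok = rm_schedulable(tasks)
print(f"U = {u:.3f}, bound = {bound:.3f}, schedulable: {ok}")
# U = 0.350, bound = 0.780 -> the set would be admitted under RM.
```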
A soft real-time OS like UNIX cannot guarantee that the system will execute on time: in soft real time the timing constraints fail only in rare cases, which reflects exactly the problem observed with the DPDC (former DESA) SCADA when operating the substations' RTUs. When operating substations from the master control, some delay can occur due to the multitasking environment, as demonstrated earlier in this paper; in this experiment we found it to reach 14000 ms. This 14000 ms can, however, be minimized by a hard RT-Linux operating system, and the desired loop time can be confined below 2000 ms by stopping all other non-RT tasks while a prioritized command is under execution.

Since microwaves travel at the speed of light, the approximately 70 km from Dhanmondi CC to Narsingdi CL3 2/3, for example, takes at most 0.233 ms, less than one percent of the entire time taken by the power-component tripping process. Relaying the signal through transponders also contributes little compared with the 30 ms time frame, even taking the maximum time measured for the master control to trip the breakers of the Jigatala substation mentioned earlier in this paper. The signal transmission time is thus negligible relative to the time span defined by the operating system, and during the establishment of the DPDC (former DESA) system the SCADA experts did not locate any other deviation in the system. There is therefore scope to reduce the response time, or desired loop time, by introducing a hard RT operating system, preferably RT-Linux, into the DPDC SCADA system.

A hard real-time OS is positively faster and can meet the requirements of a state-of-the-art SCADA system if the SCADA OS is replaced by RT-Linux. This OS can also introduce task priority as an extra feature of the SCADA system: the priority is identified first, and execution proceeds accordingly. The experiments above show that the delay time under RT-Linux is 1.56 times faster than under a soft real-time OS. A faster response means a faster SCADA system: if the maximum time taken for breaker tripping is taken into account, the response can be improved by a factor of about 1.56, allowing the system to respond in approximately 18 ms. RT-Linux performs efficiently in a multitasking environment by restricting all other non-RT tasks: it executes the prioritized task non-preemptably and stops any other task on the processor, so that no jitter can occur. Owing to this specific feature of RT-Linux, the maximum delay beyond the desired loop time can be recovered, preventing delayed breaker operation at any given instant. This is the radical improvement to the DPDC SCADA system that our paper suggests.
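As a quick arithmetic check of the propagation figures quoted above, the snippet below computes the one-way microwave delay over 70 km and its share of the 30 ms response budget.

```python
C = 3.0e8          # propagation speed of microwaves, m/s (speed of light)
DISTANCE_M = 70e3  # Dhanmondi CC to Narsingdi CL3 2/3, approx. 70 km

delay_ms = DISTANCE_M / C * 1000.0
print(f"one-way propagation delay: {delay_ms:.3f} ms")        # ~0.233 ms
print(f"share of 30 ms budget: {delay_ms / 30 * 100:.2f} %")  # < 1 %
```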


9 CONCLUSION

A SCADA system is a communication and control system used for monitoring, operating and maintaining energy infrastructure grids, and at the same time it faces harsher deadlines for critical tasks than traditional applications. This paper has identified the cause of the failure of SCADA power components in different Remote Terminal Units to execute consistently within a specified time limit, and has proposed RT-Linux as the primary operating system for DPDC SCADA, together with an advanced communication system, to be the focus of our further research, so as to achieve better response times when tripping or initiating relays or breakers of different feeders or substations. These approaches will move the existing SCADA technology towards future development and offer the timing precision of supervisory control so that no command fails to execute. From the operating system's point of view, RT-Linux makes the SCADA system 1.56 times faster than before, which compensates approximately 13 ms of the previous maximum time taken by the master control to operate breakers. Fibre optics allows more communication security by reducing signal attenuation and data interception. The faster response time and the priority task handling of the hard real-time operating system (RT-Linux) will improve the performance of the existing SCADA system, saving resources and money; moreover, it will improve the quality and security of power distribution for the new DPDC SCADA.

10 ACKNOWLEDGMENT

This work has been conducted and supported by the Control Application Research Group (CARG) of BRAC University as an individual student research project. SCADA (Katabon) of Dhaka Power Distribution Company Ltd has been tremendously supportive during this research; the SCADA superintendent engineer and communication engineer extended helpful information and expert opinions. The experiment comparing hard and soft real-time operating systems was performed under CARG, and part of that experiment was provided by the members of CARG, which unquestionably assisted the research on the SCADA system of Dhaka Power Distribution Company Ltd.

11 REFERENCES

[1] CIGRE Study Committee B5/WG07, "The Automation of New and Existing Substations: Why and How," Draft Final Report, November 2002.
[2] Power Engineering Society Substations Committee I, Subcommittee C3/Task Force 1, "Recommended Practice for Network Communication in Electric Power Substations," IEEE Project P1615.
[3] Hydro One Inc., "Transmission Control Room and SCADA Standard," Hydro One Internal Report, February 3, 1999.
[4] J. R. Davidson, M. R. Permann, B. L. Rolston and S. J. Schaeffer, "ABB SCADA/EMS System INEEL Baseline Summary Test Report," November 2004.
[5] Gomaa Hamoud, Rong-Liang Chen and Ian Bradley, "Risk Assessment of Power Systems SCADA," IEEE Power Engineering Society General Meeting 2003, Vol. 2, 13-17 July 2003, p. 764.
[6] I. Kaya, N. Tan and D. P. Atherton, "A Simple Procedure for Improving Performance of PID Controllers," Proceedings of the IEEE Conference on Control Applications (CCA 2003), Vol. 2, pp. 882-885.
[7] C. M. Hussain, M. Alam and A. M. Azad, "Performance of PID Controller both in Hard and Soft RTOS," Proceedings of the China-Ireland International Conference on Information and Communication Technologies 2008, Beijing, China.
[8] ABB Utility Vice President of Export Sales, "Bangladesh can save more money in power sector using SCADA system," Daily Star, April 11, 2003.
[9] Allen Risley, Jeff Roberts and Teter Ladow, "Electronic Security of Real-Time Protection and SCADA Communications," presented before the 5th Annual Western Power Delivery Automation Conference, Spokane, Washington, April 1-3, 2003.
[10] http://wiki.answers.com/Q/What_is_the between_Linux_and_Unix
[11] M. Alam, C. M. I. Hussain, M. Moniruzzaman and S. Chowdhury, "PID Controller of Servo System in Real-Time Linux Environment," thesis submitted to the Department of Computer Science and Engineering, BRAC University.
[12] A. M. Azad, M. Alam and C. M. Hussain, "Delay Analysis of Sampled-Data Systems in Hard RTOS," Proceedings of the 5th International Conference on Control, Automation and Systems 2008, accepted to appear, Prague, Czech Republic.
[13] A. M. Azad, T. Hesketh and R. Eaton, "Real-time Implementation of Multi-rate Sampling Systems in RT-Linux Environment," Proceedings of the Fourth International Conference on Control and Automation, Montreal, Canada, 2003, pp. 605-609.
[14] http://www.desa.com.bd
[15] http://www.linuxfocus.org/English/May1998/article4.html
[16] K. Ogata, "Modern Control Engineering," Pearson Education, Inc., 2002.
[17] Hao-hua Chu and Klara Nahrstedt, "A Soft Real Time Scheduling Server in UNIX Operating System," Interactive Distributed Multimedia Systems and Telecommunication Services, 4th International Workshop, IDMS '97, Darmstadt, Germany, September 10-12, 1997, Proceedings, Vol. 1309/1997.
[18] C. M. I. Hussain, M. Alam and A. M. Azad, "Performance Improvement of DPDC SCADA System Using Hard Real-Time OS," The 2nd International Conference on Control, Instrumentation and Mechatronic Engineering, CIM2009.
[19] Ahmed Al Amin, Asaduzzaman All Faruk, Md. Asiful Alam and A. M. Azad, "Closed Loop Control of Stepper Motor Using Real-Time Linux in Case of Multitasking."
[20] Technical Reference, Peter Nordvall, BTA, 11-08-98, ABB Network Control, S.P.I.D.E.R. System Program Versions No: 1KSE 6031-931, Reference L4654.1006, Page 3(7).


HYBRID DECODING SCHEMES FOR TURBO-CODES

Shujun Huang*, Yinwei Zhan†
Faculty of Computer, Guangdong University of Technology, Guangzhou 510006, China.

Charith Abhayaratne
Dept. of Electronic & Electrical Engineering, The University of Sheffield, Sheffield S1 3JD, United Kingdom.

* The authors would like to express their sincere gratitude to Mr. Jun Yang and Mr. Jiajun Wen for their comments.
† Corresponding author: ywzhan@ieee.org. The work of Yinwei Zhan is supported by the Natural Science Foundation of China (grant no. 60572078).

ABSTRACT

Frequently used decoding algorithms for turbo decoding are the maximum a posteriori (MAP) algorithm and the soft output Viterbi algorithm (SOVA). The Log-MAP algorithm is a transformation of the MAP algorithm into the logarithmic domain; it shows the better decoding performance, while the SOVA has the lower computational complexity. In this paper, we propose hybrid turbo decoding schemes using the SOVA and Log-MAP algorithms. Both theoretical analysis and experimental results show that the proposed hybrid turbo decoding has less computational complexity than the Log-MAP turbo decoder and better decoding performance than the SOVA turbo decoder. Its decoding performance can approach that of the Log-MAP turbo decoder within a small number of iterations.

Index Terms— Turbo-codes, SOVA, Log-MAP, hybrid decoding schemes

1. INTRODUCTION

In wireless mobile communication systems, powerful channel coding is essential to obtain sufficient reception quality. In 1993, Berrou et al. [1] proposed turbo-codes, which exploit the advantages of random-like codes and the decoding conditions of Shannon's noisy channel coding theorem to achieve performance close to the Shannon limit. A turbo-code integrates an interleaver with convolutional codes, so that it can realize quasi-random coding and improve the low code weight. It adopts iterative decoding, presented in [2], to approach maximum likelihood decoding. In [3], X. Qi et al. proposed a new iterative decoding scheme. Daneshgaran et al. [4] presented a systematic method for the design of interleavers. Perez et al. [5] proved that the performance of a turbo-code can be improved by increasing the length of the interleaver. According to their design philosophies, interleavers can roughly be divided into two types: regular and random. A regular interleaver usually interleaves on the basis of a fixed rule and can therefore be realized easily; for a random interleaver, however, it is difficult to realize the deinterleaver.

The maximum a posteriori (MAP) algorithm and the soft output Viterbi algorithm (SOVA) have become the major decoding algorithms for turbo decoding. For example, the Bahl algorithm is a MAP decoding method that minimizes the probability of symbol (or bit) error for convolutional codes [6]. The SOVA, on the other hand, is based on a Viterbi algorithm minimizing the probability of word error; it also suits convolutional codes [7]. The Log-MAP algorithm is a variation of the MAP algorithm in the logarithmic domain; it yields lower computational complexity than the MAP algorithm and results in nearly the same decoding performance. Compared to the Log-MAP algorithm, the SOVA is lower in computational complexity but much worse in decoding performance. Complexity comparisons between the two algorithms are presented in [8].

In this paper, we propose hybrid decoding schemes for turbo-codes using the Log-MAP algorithm and the SOVA. This paper is organized as follows.
In §2, we briefly review the structure of the turbo encoder, the decoder, and the soft input soft output (SISO) decoders based on the Log-MAP algorithm and the SOVA. In §3, we present our hybrid decoding scheme and its analysis. Experimental results comparing the hybrid decoding schemes with the original individual schemes are shown in §4, followed by conclusions in §5.

2. PRELIMINARIES

A turbo encoder uses a structure of parallel concatenated convolutional codes. Typically, it is composed of at least an interleaver and two recursive systematic convolutional (RSC) encoders, as shown in Fig. 1. For an input bit d_k, the turbo encoder output is {X_k^s} = {d_k}, {X_k^{1p}} and {X_k^{2p}}, where the last two components are the outputs of the RSC encoders.

A turbo decoder, as shown in Fig. 2, mainly consists of two SISO decoders, interleavers and deinterleavers. For an AWGN channel and binary modulation, the turbo decoder input comprises three random variables y_k^s, y_k^{1p} and y_k^{2p}
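As an illustration of the parallel concatenated structure just described, the sketch below encodes a bit sequence with two identical (7, 5) RSC constituent encoders (the component code used in the experiments of §4), one fed directly and one through an interleaver. The simple reversal interleaver here is a placeholder, not the spiral interleaver used later in the paper.

```python
def rsc_75(bits):
    """Rate-1/2 recursive systematic convolutional (7,5) encoder:
    feedback polynomial 7 (1 + D + D^2), feedforward polynomial 5 (1 + D^2).
    Returns the parity sequence; the systematic output is `bits` itself."""
    s1 = s2 = 0
    parity = []
    for d in bits:
        a = d ^ s1 ^ s2          # feedback taps (1 + D + D^2)
        parity.append(a ^ s2)    # feedforward taps (1 + D^2)
        s1, s2 = a, s1
    return parity

def turbo_encode(bits, interleave):
    """Parallel concatenation: systematic bits plus two parity streams."""
    xs = list(bits)                       # X^s  (systematic)
    x1p = rsc_75(xs)                      # X^1p (first RSC)
    x2p = rsc_75(interleave(xs))          # X^2p (second RSC, interleaved input)
    return xs, x1p, x2p

# Placeholder interleaver for illustration only.
xs, x1p, x2p = turbo_encode([1, 0, 1, 1, 0], lambda b: b[::-1])
print(xs, x1p, x2p)
```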


at time k, corresponding to the three output bits from the encoder. Let u_k = 2X_k^s − 1, let û_k be the bit decision of the SISO decoder, let L_e(û_k) be the extrinsic information (EI) generated by one SISO decoder, which does not depend on the decoder input y_k^s, and let L_a(u_k) be the a priori information (PI) associated with the EI generated by the other SISO decoder in the turbo decoder.

Fig. 1. The structure of the turbo encoder.
Fig. 2. The structure of the turbo decoder.

2.1. The Log-MAP decoder

An optimum soft output decoder should adopt the a posteriori probability. The MAP algorithm, based on the code trellis, minimizes the bit error probability. The output of a MAP decoder for a transmitted "±1" in the information sequence is defined as the a posteriori log-likelihood ratio (LLR),

$$L(\hat u_k) = \ln\frac{p(u_k=+1\,|\,r)}{p(u_k=-1\,|\,r)} = \ln\frac{\sum_{(s',s),\,u_k=+1} p(s',s,r)}{\sum_{(s',s),\,u_k=-1} p(s',s,r)},$$

where r is the reception of the MAP decoder and the bit u_k is associated with the transition from time k − 1 to k. The indexes s′ and s correspond to the trellis states at times k − 1 and k, respectively. The transition from s′ to s arises from the information bit u_k.

The Bayesian principle implies that

$$p(s',s,r) = p(s',s,r_{t<k},r_k,r_{t>k}) = p(r_{t>k}\,|\,s)\,p(s,r_k\,|\,s')\,p(s',r_{t<k}). \qquad (1)$$

The Log-MAP decoder avoids calculating actual probabilities by using the logarithm of probabilities and the Jacobian logarithm max*(x, y) = ln(e^x + e^y). If the transition between s′ and s exists, (1) can be updated as

$$\alpha^*_{k+1}(s) = \max^*_{s'\in\sigma_k}\left[\gamma^*_k(s',s) + \alpha^*_k(s')\right],$$
$$\beta^*_k(s') = \max^*_{s\in\sigma_{k+1}}\left[\gamma^*_k(s',s) + \beta^*_{k+1}(s)\right],$$
$$\gamma^*_k(s',s) = \frac{u_k L_a(u_k)}{2} + \frac{L_c}{2}\, r_k \cdot v_k,$$

where σ_k is the state set at time k, P(u_k) is the a priori probability of bit u_k, v_k = (u_k, p_k) with p_k = 2X_k^p − 1 and X_k^p = X_k^{1p} or X_k^{2p}, r_k is the reception in response to v_k, and L_c = 4E_s/N_0 is the reliability factor of the channel. Based on the above analysis, the output of a Log-MAP decoder can be written as

$$L(\hat u_k) = \max^*_{(s',s),\,u_k=+1}\left[\alpha^*_k(s') + \gamma^*_k(s',s) + \beta^*_{k+1}(s)\right] - \max^*_{(s',s),\,u_k=-1}\left[\alpha^*_k(s') + \gamma^*_k(s',s) + \beta^*_{k+1}(s)\right].$$
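The max* operation above is the only non-trivial primitive of the Log-MAP recursions; a minimal numerical sketch of it, and of one step of the forward (alpha) recursion over a toy two-state trellis, is shown below. The branch metrics are arbitrary example values, not taken from the paper.

```python
import math

def max_star(x, y):
    """Jacobian logarithm: ln(e^x + e^y) = max(x, y) + ln(1 + e^{-|x-y|})."""
    return max(x, y) + math.log1p(math.exp(-abs(x - y)))

def forward_step(alpha, gamma):
    """One step of the alpha recursion on a 2-state trellis.
    alpha[s'] are the previous log forward metrics; gamma[s'][s] are
    log branch metrics (example values only)."""
    return [
        max_star(*[alpha[sp] + gamma[sp][s] for sp in range(2)])
        for s in range(2)
    ]

alpha = [0.0, -1.0]                    # example initial log metrics
gamma = [[0.3, -0.7], [-0.2, 0.5]]     # example log branch metrics
print(forward_step(alpha, gamma))
```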


2.2. The SOVA decoder

By updating the reliability value of the survivor path with backward tracking, and according to the inference in [2], the soft output (or LLR) of the bit decision of the SOVA decoder can be written as

$$L(\hat u_{k-m}) \approx \hat u_{k-m}\min_{j=0,\cdots,m}\Delta_{k-j}(S_i),$$

where û_{k−m} is the bit decision at time k − m.

3. HYBRID TURBO DECODING SCHEMES

Based on the above analysis of the Log-MAP decoder and the SOVA decoder, both output the LLR, L(û_k), of the bit decision, so the two SISO components of a turbo decoder can be different.

The succeeding structure of a hybrid turbo decoder with two SISO decoders is shown in Fig. 3. In the conventional schemes, the SISO decoders are either Log-MAP only (LM-LM) or SOVA only (SV-SV). In the hybrid case, we propose to use the Log-MAP decoder in one SISO decoder and the SOVA in the other, or vice versa (LM-SV or SV-LM).

Fig. 3. The succeeding structure of a hybrid turbo decoder.

Table 1 shows the computational complexity comparison of the decoding schemes for a single iteration. In Table 1, each quantity must be multiplied by n, the length of the information sequence, and A = 2^v, where v is the number of registers of one component encoder.

Table 1. The computational complexity comparison of the decoding schemes

Operation    | LM-LM   | SV-SV       | SV-LM / LM-SV
max ops      | 10A − 4 | 2A + 6v + 6 | 6A + 3v + 3
additions    | 30A + 18| 4A + 16     | 17A + 17
mult. by ±1  | 16      | 16          | 16
look-ups     | 10A − 4 | 0           | 5A − 2
bit comps    | 0       | 12v + 12    | 6v + 6

According to this analysis, when v = 2 the number of operations of the Log-MAP turbo decoder is the highest of all these decoding schemes, about twice that of the SOVA turbo decoder. The number of operations of the hybrid turbo decoder is a compromise between those of the Log-MAP turbo decoder and the SOVA turbo decoder; thus it has less computational complexity than the Log-MAP turbo decoder.

As regards the hybrid turbo decoders, the soft output of each component decoder can be represented as

$$L(\hat u_k) = L_c \cdot y_k + L_a(u_k) + L_e(\hat u_k).$$

In the process of iterative decoding, the EI can be solved as L_e(û_k) = L(û_k) − L_c · y_k − L_a(u_k). L_e(û_k) is then exchanged between the two component decoders so that each can take advantage of the EI of the other. Provided that L_e(û_k) is generated by the SOVA decoder, the interleaved L_e(û_k) becomes the PI L_a(u_n) of the Log-MAP decoder; L_e(û_n) of the Log-MAP decoder, after deinterleaving, becomes the feedback for the SOVA decoder and is applied in the next iteration. This iterative mechanism improves the decoding performance.

4. EXPERIMENTAL RESULTS

We evaluate the performance of the turbo decoders described above over AWGN channels. For this, we choose an interleaver with a regular spiral structure, which can be realized easily. In the interleaving process, the data bits are written into an m × n matrix row-wise and then read from the upper left corner to the lower right corner; conversely, in deinterleaving, the data bits are written into an m × n matrix from the upper left corner to the lower right corner and then read row-wise. A sketch of this read/write pattern is given below.
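A minimal sketch of the row-write, corner-to-corner read pattern follows. The exact read order in the paper is defined by Fig. 4; here the upper-left-to-lower-right read is interpreted as a diagonal scan, which is one plausible reading of the figure and is stated as an assumption.

```python
def interleave(bits, m, n):
    """Write `bits` row-wise into an m x n matrix, then read it out along
    anti-diagonals from the upper left corner to the lower right corner.
    The diagonal read order is an assumption based on Fig. 4."""
    assert len(bits) == m * n
    matrix = [bits[r * n:(r + 1) * n] for r in range(m)]
    out = []
    for d in range(m + n - 1):                    # diagonal index r + c = d
        for r in range(max(0, d - n + 1), min(m, d + 1)):
            out.append(matrix[r][d - r])
    return out

def deinterleave(bits, m, n):
    """Inverse operation: write along the same diagonals, read row-wise."""
    matrix = [[0] * n for _ in range(m)]
    it = iter(bits)
    for d in range(m + n - 1):
        for r in range(max(0, d - n + 1), min(m, d + 1)):
            matrix[r][d - r] = next(it)
    return [b for row in matrix for b in row]

data = list(range(64))                            # an 8 x 8 sample, as in Fig. 4
assert deinterleave(interleave(data, 8, 8), 8, 8) == data
```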
Fig. 4. The structure of an 8 × 8 sample for the interleaver.

Fig. 4 shows the organisation of an 8 × 8 sample for the interleaver. In this way, neighbouring input information bits end up far away from each other after interleaving, which certainly improves the performance of the turbo code. In general, when the interleaver is large enough, an interleaved input sequence can be considered a random code.

The experiments were conducted on a system with a 2.66 GHz Pentium 4 CPU and 512 MB of DDR memory, using the C++ programming language. We choose the (7, 5) RSC code as the component code of the turbo encoder. The decoding depth of the SOVA decoder is 18, and the number of iterations is 10. For the range of SNRs of interest (E_b/N_0 = 0.0 dB, · · · , 1.5 dB), Fig. 5 and Fig. 6 show the comparison of the decoding performance for the decoding schemes mentioned, with 16384 information bits and coding rates R = 1/3 and 1/2, respectively.


Fig. 5. The comparison of the decoding performance for the decoding schemes mentioned, with coding rate R = 1/3 (BER versus iterations at E_b/N_0 = 0.0, 0.5, 1.0 and 1.5 dB for LM-LM, SV-SV, LM-SV and SV-LM).

Fig. 6. The comparison of the decoding performance for the decoding schemes mentioned, with coding rate R = 1/2 (BER versus iterations at E_b/N_0 = 0.0, 0.5, 1.0 and 1.5 dB for LM-LM, SV-SV, LM-SV and SV-LM).

At R = 1/3, the SOVA turbo decoder gives the worst decoding performance while the Log-MAP turbo decoder gives the best. The two hybrid turbo decoders perform nearly identically. In the early iterations their BERs are roughly the mean of those of the Log-MAP and SOVA turbo decoders; with more iterations, however, their BERs can come close to that of the Log-MAP turbo decoder. At 0.5 dB, they need about eight iterations to approach the BER of the Log-MAP turbo decoder; at 1.5 dB, about five iterations. The higher the SNR (E_b/N_0), the fewer iterations they need to approach the BER of the Log-MAP turbo decoder. Similar performance is seen for R = 1/2. Table 2 shows the comparison of the cost for the decoding schemes with 16384 information bits.

Table 2. The comparison of the cost for the decoding schemes with 16384 information bits, R = 1/3

SNR    | LM-LM   | SV-SV   | SV-LM   | LM-SV
0.0 dB | 5.094 s | 0.813 s | 2.938 s | 3.031 s
0.5 dB | 5.109 s | 0.813 s | 2.969 s | 2.938 s
1.0 dB | 5.079 s | 0.797 s | 2.953 s | 2.921 s
1.5 dB | 5.094 s | 0.813 s | 2.969 s | 2.968 s

5. CONCLUSION

In this paper, we have proposed hybrid turbo decoding schemes using the Log-MAP algorithm and the SOVA. Theoretical analysis of the number of operations proves that the hybrid turbo decoders require fewer operations than the Log-MAP turbo decoder. Based on the experiments, we conclude that their decoding performance is superior to that of the SOVA turbo decoder and even close to that of the Log-MAP turbo decoder within a small number of iterations. Furthermore, their costs are lower than that of the Log-MAP turbo decoder. The proposed schemes contribute an effective strategy for wireless communication.

6. REFERENCES

[1] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon limit error-correcting coding and decoding: Turbo-codes (1)," IEEE ICC 93, vol. 2, pp. 1064–1070, 1993.
[2] J. Hagenauer, E. Offer, and L. Papke, "Iterative decoding of binary block and convolutional codes," IEEE Transactions on Information Theory, vol. 42, no. 2, pp. 429–445, Mar 1996.
[3] X. Qi, M. Zhao, S. Zhou, and J. Wang, "An iterative decoding scheme with turbo code and iteratively demapped multidimensional QPSK serially concatenated," IEEE GLOBECOM '05, vol. 3, Nov 2005, [CD-ROM].
[4] F. Daneshgaran and M. Laddomada, "An improved interleaver design technique for parallel concatenated convolutional codes," IEEE ICC 03, vol. 5, pp. 3100–3104, 2003.
[5] L. C. Perez, J. Seghers, and D. J. Costello, Jr., "A distance spectrum interpretation of turbo codes," IEEE Transactions on Information Theory, vol. 42, no. 6, pp. 1698–1709, Nov 1996.
[6] L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, "Optimal decoding of linear codes for minimizing symbol error rate (corresp.)," IEEE Transactions on Information Theory, vol. 20, no. 2, pp. 284–287, Mar 1974.
[7] J. Hagenauer and P. Hoeher, "A Viterbi algorithm with soft-decision outputs and its applications," IEEE GLOBECOM '89, vol. 3, pp. 1680–1686, Nov 1989.
[8] P. Robertson, E. Villebrun, and P. Hoeher, "A comparison of optimal and sub-optimal MAP decoding algorithms operating in the log domain," IEEE ICC 95, vol. 2, pp. 1009–1013, 1995.


JavaScript Code Fragment Analyzing for Cache Proxy

Yachao Zhou 1, Xiaofei Wang 1, Yi Tang 2, Jieyan Yang 3 and Xiaojun Wang 1
1 School of Electronic Engineering, Dublin City University, Ireland
2 Department of Computer Science and Technology, Tsinghua University, Beijing, PRC
3 China Unicom System Integration Limited Corporation, PRC
Email: yachao@eeng.dcu.ie

This work is supported by Enterprise-Ireland.

Abstract—The JavaScript language is used to enhance the client-side display of web pages. Programs in JavaScript are deployed in HTML documents and interpreted by web browsers on the client machine, helping to make web pages richer and more dynamic. JavaScript is widely used in most popular websites, such as CNN, a famous news website. We calculate the percentages of two popular JavaScript fragments and propose a proxy-based scheme to reuse identical JavaScript fragments efficiently. The proxy can extract identical JavaScript code fragments into a JavaScript library and detect whether they contain malicious code.

Key Words—JavaScript, cache proxy, web security.

I. INTRODUCTION

The web browser's role is becoming more and more important day by day. Web applications can be broadly categorized into static and dynamic: static web applications simply display information to the user, while dynamic web applications accept input from the user and perform actions based on that input. The browser is no longer just a tool for accessing static HTML pages but today's platform for Rich Internet Applications (RIA). Web 2.0 makes the browsing experience more and more like using a traditional desktop application. Since business needs keep growing, the hypertext markup language continues to grow to meet them with new, powerful and exciting tags.

Most web applications employ a distributed model, in which the client side is used not simply to submit user data and actions but also to perform actual data processing. This is done for two primary reasons:

(1) It can improve the application's performance, because certain tasks can be carried out entirely on the client component, instead of making a round trip of request and response to the server.

(2) It can enhance usability, because parts of the user interface can be updated dynamically in response to user actions, without loading an entirely new HTML page delivered by the server.

Larbin is a web crawler, also called a (web) robot, spider, or scooter. It is intended to fetch a large number of web pages to fill the database of a search engine. With a fast enough network, Larbin should be able to fetch more than 100 million pages on a standard PC.

In this paper, we use the web crawler Larbin to fetch some popular web sites and analyze detailed information about the JavaScript fragments we care about. Our contributions are as follows:

• Analyze the distribution of certain kinds of JavaScript code fragments on some popular websites.
• Propose a proxy to extract JavaScript code fragments and rewrite the HTML pages.
• Enhance the efficiency of JavaScript code libraries via a cache proxy engine.

This paper is organized as follows. Related work is presented in section II, the methodology is provided in section III, and section IV is the conclusion.
II. RELATED WORKS

Emre [2] presents AjaxScope, a dynamic instrumentation platform that enables cross-user monitoring and just-in-time control of web application behavior on end-user desktops. AjaxScope enables a large number of exciting instrumentation policies, such as performance measurement, runtime analysis and debugging, and usability evaluation. It requires no changes to web browsers: the AjaxScope proxy simply rewrites the uninstrumented JavaScript code dynamically according to the instrumentation policies.

NeatHtml [3] and Caja [4] take an alternate approach by parsing untrusted HTML with a trusted client-side JavaScript library. After parsing, untrusted content is filtered in a series of steps to ensure that it is free of script content. Both embed untrusted HTML in the web page using trusted client-side JavaScript code via the innerHTML DOM property (for Caja this is a deliberate optimization choice), which protects against node-splitting attacks.

Proxies have been used to introduce new services between web clients and servers. For example, they have been used to provide web caching [5, 6] and gateway services for onion-routing anonymizers [7]. In this paper, we propose to use a proxy as a combination of blacklisting, whitelisting JavaScript filters and heuristics to identify potentially malicious web content.

Several pieces of previous work [8, 9] have used rewriting at the Java Virtual Machine bytecode interface [10]. This interface is type-safe and provides good support for reasoning about application-internal abstractions.

In this paper, we use the web crawler to fetch some popular web sites and analyze detailed information about selected JavaScript fragments. With the knowledge of the JavaScript code


fragment distribution of some popular websites, we further propose to use a whitelist and blacklist to help filter potentially malicious HTML pages.

III. METHODOLOGY

Traditionally a web page was loaded once, and when its content was to be updated, the whole page had to be loaded again. Asynchronous JavaScript + XML (Ajax) is a technology that allows content to be updated asynchronously without reloading the whole page. XMLHttpRequest is an API that enables connections to remote sources via HTTP from the client side; it is a key part of Ajax. The transfer format used is usually XML, JavaScript Object Notation (JSON), HTML or plain text. In Ajax applications, JavaScript is used on the client side to make connections and gather data through XMLHttpRequest and then to modify the page by accessing the Document Object Model and the CSS style sheets of the page. With Ajax, data can be exchanged with the server in small amounts, and the user is given an experience more and more like a desktop application, without the whole page being loaded again. These kinds of RIAs appear to be an essential part of the growing number of Web 2.0 applications.

JavaScript is a relatively simple but powerful programming language that can easily extend web interfaces in ways not possible using HTML alone. It is commonly used to perform the following tasks:

• Validating user-entered data before it is submitted to the server, to avoid unnecessary requests if the data contains errors.
• Dynamically modifying the user interface in response to user actions, for example to implement drop-down menus and other controls similar to those of non-web interfaces.
• Querying and updating the document object model within the browser to control the browser's behavior.

Figure 2 Generalized functional diagram of existing browsers' HTML interpretation process.

As sketched in Figure 2, the browser parses JavaScript code and builds a JS parse tree whenever JavaScript occurs in the HTML DOM tree. After execution on the client side, or communication with the remote server via Ajax in the JavaScript runtime environment, a render tree is generated as the basis of the web view. If the JavaScript is too complicated, the browser cannot display the whole page before all the JavaScript has been executed. Obviously, if two web pages contain the same JavaScript fragment, we can reuse it after downloading it once, and so save time.

The web crawler Larbin is used to fetch web pages from the Cable News Network (CNN.com), one of the most famous news websites. We fetched a huge number of web pages, excluding files with extensions such as .pdf, .tar and .mp3. After fetching the CNN pages from March 1st to March 6th, we obtained approximately 102,859 files in 52 folders. We then calculated the percentage of two popular kinds of JavaScript code fragments, the external script reference and the "javascript:" fragment.
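A minimal sketch of the fragment statistics just described: it scans a directory of fetched HTML files and counts total versus unique occurrences of the two fragment kinds. The directory path and the exact regular expressions are illustrative assumptions; the paper's own counts came from the Larbin crawl of CNN.com.

```python
import os, re, hashlib

# Illustrative patterns for the two fragment kinds discussed in the text:
# external script references and "javascript:" URLs.
PATTERNS = {
    "external script": re.compile(r"<script[^>]+src\s*=\s*[\"']([^\"']+)[\"']", re.I),
    "javascript:":     re.compile(r"javascript:[^\"'\s>]+", re.I),
}

def fragment_stats(root):
    """Count total and unique matches of each pattern under `root`."""
    totals = {k: 0 for k in PATTERNS}
    unique = {k: set() for k in PATTERNS}
    for dirpath, _, files in os.walk(root):
        for name in files:
            if not name.endswith((".html", ".htm")):
                continue
            with open(os.path.join(dirpath, name), errors="ignore") as f:
                text = f.read()
            for kind, pat in PATTERNS.items():
                for m in pat.finditer(text):
                    totals[kind] += 1
                    digest = hashlib.md5(m.group(0).encode()).hexdigest()
                    unique[kind].add(digest)
    return {k: (totals[k], len(unique[k])) for k in PATTERNS}

# e.g. fragment_stats("crawl/cnn")  ->  {kind: (total, unique)}
```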


The external JavaScript itself is simply a text file containing the JavaScript code, saved as a .js file. A script tag refers to the external JavaScript file and is placed on the pages that use the library. Similarly, the "javascript:" fragment also shows a high percentage of repeated code. From the statistics we can draw the following conclusions:

1. The number of unique script code segments is far smaller than the total number of script code segments, so a website could reuse script code at the proxy rather than downloading the same JavaScript fragment on different web pages.
2. JavaScript code does not change frequently even when the web pages themselves change dynamically, as with the index page of CNN and some portal pages.
3. Large JavaScript code has a low probability of modification: in practice, editors seldom update JavaScript code after submitting it to the server.

Most programming languages provide libraries; C and Java, for example, support "code libraries", where a programmer can save a commonly used piece of code as a library file and refer to it from the main program. Software developers generally use .jar files to distribute Java classes and associated metadata. JavaScript supports external libraries too, in the form of the .js file.

While the main reason for using JavaScript libraries is obviously that they let you distribute one piece of code to users across many pages, there is a secondary reason: a JavaScript library used on multiple pages is actually more efficient than directly embedding the same code in each page. Once the client-side browser encounters the library for the first time, it saves the contained code in its cache; subsequent requests for the library on other pages result in the library being loaded from cache, at great speed.

Figure 3 Cache server-side proxy for popular websites

As shown in Figure 3, we use a proxy server to extract the JavaScript code fragments and rewrite them into a library, so that multiple pages can share the same JavaScript code library without physically including the original code on those pages. With the reference in place, the client-side browser downloads the code stored in the external .js file and runs it as if the code had been physically typed into the original web page.

At the same time, our proxy engine inspects the JavaScript library to see whether it contains malicious code. Since a JavaScript library does not change often, we can easily maintain a blacklist of malicious libraries. If virus code is found, the proxy engine stops parsing the JavaScript code and alerts the client users.

IV. CONCLUSION

In this paper, we analyze the percentages of two popular kinds of JavaScript code fragments on a popular website. According to the statistics, approximately 95.2% of the JavaScript code fragments are repeated. A cache proxy is proposed to extract these identical JavaScript fragments into external .js JavaScript code libraries and save the expensive runtime cost.

V. FUTURE WORK

As ongoing research, we plan to make the proxy work as a special sandbox that executes the JavaScript in place of the client side and sends the resulting code back to the client browser. The proxy must be authenticated by the client user to access sensitive information such as cookies.

REFERENCES

[1] http://larbin.sourceforge.net/index-eng.html
[2] Emre Kıcıman, "AjaxScope: A Platform for Remotely Monitoring the Client-Side Behavior of Web 2.0 Applications," SOSP '07, Oct. 2007.
[3] D. Brettle, "NeatHtml: Displaying untrusted content securely, efficiently, and accessibly," Jun.
2008, white paper. [Online]. Available: http://www.brettle.com/NeatHtml/docs/Fighting_XSS_with_JavaScript_Judo.html


[4] Google Caja, "A source-to-source translator for securing JavaScript-based web content." [Online]. Available: http://code.google.com/p/google-caja/
[5] Alec Wolman, Geoff Voelker, Nitin Sharma, Neal Cardwell, Anna Karlin, and Henry Levy, "On the scale and performance of cooperative Web proxy caching," in Proceedings of the 17th ACM Symposium on Operating Systems Principles (SOSP '99), Kiawah Island, SC, December 1999.
[6] Brian Duska, David Marwood, and Michael J. Feeley, "The measured access characteristics of World Wide Web client proxy caches," in Proceedings of the 1st USENIX Symposium on Internet Technologies and Systems (USITS 97), Monterey, CA, December 1997.
[7] Roger Dingledine, Nick Mathewson, and Paul Syverson, "Tor: The second-generation onion router," in Proceedings of the 13th USENIX Security Symposium, San Diego, CA, August 2004.
[8] U. Erlingsson and F. B. Schneider, "IRM Enforcement of Java Stack Inspection," in IEEE Symposium on Security and Privacy, 2000.
[9] U. Erlingsson and F. B. Schneider, "SASI Enforcement of Security Policies: A Retrospective," in WNSP: New Security Paradigms Workshop, 2000.
[10] T. Lindholm and F. Yellin, The Java Virtual Machine Specification, 2nd edition, 1999.


Section 3A
CHANNELS AND PROPAGATION


Research on the Gain Flatness of a Fiber-Optic Parametric Amplifier with Periodic Dispersion Compensation

Jing Jin, Qiliang Li
Institute of Communication and Information System, College of Communications, Hangzhou Dianzi University, Hangzhou 310018, China
kimjung2@yahoo.cn

Abstract

We study the gain flatness of a two-pump fiber-optic parametric amplifier with periodic dispersion compensation. The results show that the gain fluctuation of the two-pump fiber-optic parametric amplifier is improved markedly by inserting dispersion compensating fibers at regular intervals in a highly nonlinear fiber. The gain is found to depend on the length of the highly nonlinear fiber: as this length increases, the parametric gain rises but the gain flatness decreases. The parametric gain can be flattened by inserting more dispersion compensation fiber segments.

Keywords: fiber-optic parametric amplifier; highly nonlinear fiber; dispersion compensation fiber; gain flatness

1. Introduction

Fiber-optic parametric amplifiers (FOPAs), relying on four-wave mixing (FWM), have attracted considerable attention in recent years. FOPAs offer some unique properties compared with other traditional amplifiers, such as the Erbium-doped fiber amplifier [1], [2]: in theory FOPAs can amplify at any arbitrary wavelength, and they offer high gain and large bandwidth as well as a low noise figure [3].

FOPAs can be classified into two categories, one-pump and two-pump. Owing to the phase-matching condition of the underlying FWM process, single-pump FOPAs generally exhibit a large gain fluctuation in the zero-dispersion wavelength (ZDWL) region of the optical fiber [4]. Because of this limitation, the concept of two-wavelength pumping was demonstrated theoretically in Ref. [5], where it was shown that two-pump FOPAs can provide a gain spectrum that is relatively uniform over a bandwidth larger than 100 nm. Phase matching plays an important role in the FWM process; it is the key to achieving efficient FWM. Laurent et al. proposed a scheme based on a multisection dispersion-tailored in-line nonlinear fiber arrangement that achieves flat gain bands of over 100 nm with a ripple of less than 0.2 dB in the ZDWL region [6]. Marhic et al. suggested a method in which multisection dispersion compensating fibers (DCFs) are inserted into a highly nonlinear fiber, achieving flat gain over ultrabroad bands with a single-pump fiber-optic parametric amplifier operating in the ZDWL region [7].

Pumping in the vicinity of the ZDWL, or using highly nonlinear fibers (HNLF), has been proposed to enhance the gain [7], [8]. Unfortunately, the unavoidable fluctuation of the ZDWL works to the detriment of the near-ZDWL pumping scheme [9]. This gain fluctuation can be avoided by increasing the separation between the pump wavelengths and the ZDWL (>10 nm) [9], which makes the dispersion larger. The dispersion must therefore be compensated to realize quasi-phase matching and obtain flat gain over a wide spectral range. Marhic et al. showed in Ref. [7] that a one-pump FOPA can provide higher gain and larger bandwidth by inserting multisection DCFs. In this paper we focus on the quasi-phase matching of two-pump FOPAs, which efficiently flattens the gain spectrum of the two-pump FOPA.

2. Theoretical model

We consider a length L of HNLF, consisting of (m + 1) segments of equal length l = L/(m + 1), connected by m pieces of DCF, each of length l′, as shown in Fig. 1.


The DCF is assumed to have negligible nonlinearity. We also neglect attenuation in both the HNLF and the DCF; this is a very good approximation for the short fiber lengths used with high-power pulsed pumps. The propagation constants in the HNLF and DCF are respectively $\beta(\omega)$ and $\beta'(\omega)$. We consider two pumps, a signal, and an idler, with respective angular frequencies $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$ and electric fields $E_1$, $E_2$, $E_3$, $E_4$; all fields are in the same state of linear polarization. The total electric field E can be written as

$E = \tfrac{1}{2}\hat{x}\sum_{j=1}^{4} E_j \exp(i\omega_j t - k_j z) + \mathrm{c.c.}$,    (1)

where $k_j = n_j\omega_j/c$ is the propagation constant and $n_j$ is the index of refraction.

Figure 1. Schematic of the highly nonlinear fiber with periodic dispersion compensation.

Using the Maxwell equations, the differential equations governing the pumps, the signal and the idler in the (m+1)th HNLF segment are

$dE_{1,m+1}/dz = i\gamma\alpha^m (P_1 + 2P_2)\,E_{1,m+1}$,    (2a)
$dE_{2,m+1}/dz = i\gamma\alpha^m (2P_1 + P_2)\,E_{2,m+1}$,    (2b)
$dE_{3,m+1}/dz = 2i\gamma\alpha^m (P_1 + P_2)\,E_{3,m+1} + i\gamma E_{1,m+1}E_{2,m+1}E^{*}_{4,m+1}\exp(-i\Delta\beta z)$,    (2c)
$dE_{4,m+1}/dz = 2i\gamma\alpha^m (P_1 + P_2)\,E_{4,m+1} + i\gamma E_{1,m+1}E_{2,m+1}E^{*}_{3,m+1}\exp(-i\Delta\beta z)$,    (2d)

where $E_j$ is the envelope of the electric field; z is the distance from the beginning of the first HNLF segment; $\beta_2$ is the second-order dispersion coefficient and $\gamma$ is the nonlinear parameter; $P_j = |E_{j,1}(0)|^2$ is the pump power at z = 0; $\alpha = 10^{-0.2 L_S}$ is the transmittance through two splices, $L_S$ being the power loss, in dB, of a single splice. We can expand the linear phase mismatch $\Delta\beta = \beta(\omega_3)+\beta(\omega_4)-\beta(\omega_1)-\beta(\omega_2)$ in a Taylor series [10]:

$\Delta\beta = 2\sum_{m=1}^{\infty}\frac{1}{(2m)!}\left(\frac{d^{2m}\beta}{d\omega^{2m}}\right)_{\omega=\omega_c}\left[(\omega_3-\omega_c)^{2m}-\omega_d^{2m}\right]$
$\approx \beta_2(\omega_c)\left[(\omega_3-\omega_c)^2-\omega_d^2\right] + \frac{\beta_4(\omega_c)}{12}\left[(\omega_3-\omega_c)^4-\omega_d^4\right]$
$= \beta_2(\omega_c)\left(\frac{2\pi c}{\lambda_0^2}\right)^2\left[(\lambda_3-\lambda_c)^2-\lambda_d^2\right] + \frac{\beta_4(\omega_c)}{12}\left(\frac{2\pi c}{\lambda_0^2}\right)^4\left[(\lambda_3-\lambda_c)^4-\lambda_d^4\right]$,    (3)

where $\omega_c=(\omega_1+\omega_2)/2$ is the mean frequency of the two pumps and $\omega_d=(\omega_1-\omega_2)/2$ is half their difference; $\omega_1$ and $\omega_2$ are the two pump frequencies and $\omega_3$ is the signal frequency; $\beta_2(\omega_c)$ and $\beta_4(\omega_c)$ are respectively the second- and fourth-order dispersion coefficients at $\omega_c$, with $\beta_2(\omega_c)=\beta_3(\omega_0)(\omega_c-\omega_0)+\beta_4(\omega_0)(\omega_c-\omega_0)^2/2$ and $\beta_4(\omega_c)\approx\beta_4(\omega_0)$, where $\omega_0$ is the zero-dispersion frequency; $\lambda_0$ is the zero-dispersion wavelength; $\lambda_3$ is the signal wavelength; $\lambda_c=(\lambda_1+\lambda_2)/2$ and $\lambda_d=(\lambda_1-\lambda_2)/2$, where $\lambda_1$ and $\lambda_2$ are the pump wavelengths.

3. Gain of fiber-optic parametric amplifiers

We assume that the pumps are not depleted by the nonlinear process, so the solutions of (2a) and (2b) are

$E_{1,m+1} = E'_{1,m}\exp\left[i\gamma\alpha^m(P_1+2P_2)z\right]$, $E_{2,m+1} = E'_{2,m}\exp\left[i\gamma\alpha^m(2P_1+P_2)z\right]$,    (4)

where $E'_{1,m}$ and $E'_{2,m}$ are the pump field phasors at the output of the mth DCF segment. Equation (4) includes the contributions of self-phase modulation (SPM) and cross-phase modulation (XPM) induced by the two pumps. The optical field amplitudes of the signal and idler waves, $E_j$, are related to $H_j$ by a phase factor through $E_j = H_j\exp\left[2i\gamma\alpha^m(P_1+P_2)z\right]$. Equations (2c) and (2d) can then be written as

$dH_{j,m+1}/dz = i\gamma\alpha^m\sqrt{P_1P_2}\,\exp(-i\kappa_m z)\exp(i\Phi_m)\,H^{*}_{7-j,m+1}$, j = 3, 4,    (5)

where $\Phi_m = 3\gamma(P_1+P_2)\sum_{p=0}^{m-1}\alpha^p l = 3\gamma(P_1+P_2)\,l\,\frac{1-\alpha^m}{1-\alpha}$, and $\kappa_m = \Delta\beta + \gamma\alpha^m(P_1+P_2)$ describes the total phase mismatch.
Using the boundary conditions, we can obtain the solution of (5), which can be written in matrix form:

$\begin{pmatrix} H_{3,m+1}(l) \\ H^{*}_{4,m+1}(l) \end{pmatrix} = \alpha^{1/2} M_m \begin{pmatrix} H_{3,m}(l) \\ H^{*}_{4,m}(l) \end{pmatrix}$,    (6)

where $M_m$ is the transfer matrix of the (m+1)th HNLF segment,

$M_m = \begin{pmatrix} m_{33} & m_{34} \\ m_{43} & m_{44} \end{pmatrix}$,    (7)

with

$m_{33} = e^{-i\kappa_m l/2}\,e^{i\psi_m}\left[\cosh(g_m l) + \frac{i\kappa_m}{2g_m}\sinh(g_m l)\right]$,
$m_{34} = e^{-i\kappa_m l/2}\,e^{-i\psi_m}\,\frac{i\gamma\alpha^m\sqrt{P_1P_2}\,e^{i\Phi_m}}{g_m}\sinh(g_m l)$,


$m_{43} = e^{i\kappa_m l/2}\,e^{i\psi_m}\,\frac{i\gamma\alpha^m\sqrt{P_1P_2}\,e^{-i\Phi_m}}{g_m}\sinh(g_m l)$,
$m_{44} = e^{i\kappa_m l/2}\,e^{-i\psi_m}\left[\cosh(g_m l) - \frac{i\kappa_m}{2g_m}\sinh(g_m l)\right]$,

where $\psi_m = \left[2\gamma\alpha^m(P_1+P_2)\,l' + \Delta\beta'\,l'\right]/2$, m = 1, 2, ..., and $g_m$, the parametric gain coefficient of the (m+1)th HNLF segment, is given by $g_m = \sqrt{(\gamma\alpha^m)^2 P_1P_2 - (\kappa_m/2)^2}$.

The transfer matrix M for the whole arrangement is $M = M_0 M_1 \cdots M_m$, so (6) can be written as

$\begin{pmatrix} H_{3,m+1}(l) \\ H^{*}_{4,m+1}(l) \end{pmatrix} = \alpha^{m/2}\, M_0 M_1 \cdots M_m \begin{pmatrix} H_3(0) \\ H^{*}_4(0) \end{pmatrix} = \alpha^{m/2}\, M \begin{pmatrix} H_3(0) \\ H^{*}_4(0) \end{pmatrix}$.    (8)

The total signal gain at the output of the (m+1)th HNLF segment is then

$G_{m+1} = 20\lg\left|H_{3,m+1}(l)/H_3(0)\right|$.    (9)

4. Analysis of gain flatness

The above expression shows that the parametric gain depends on the number of dispersion-compensation fiber sections and on the length of the highly nonlinear fiber. We analyze the gain flatness by varying the number of inserted DCF segments and the HNLF length.

Fig. 2 shows how the gain value and flatness are influenced by the number of inserted DCF segments. The HNLF has a zero-dispersion wavelength λ0 = 1591 nm, which is less than ideal for making FOPAs operating near 1550 nm, as desirable for optical communication systems. The parameters used in our simulations are: pump powers 0.2 W and 0.4 W, respectively; pump wavelengths 1542 nm and 1560 nm; L = 300 m; γ = 20 W⁻¹km⁻¹; β2(ω0) = −0.01 ps²/km, β4(ω0) = −2.85 ps⁴/km; β′2(ω0) = 2 ps²/km, β′4(ω0) = 3×10⁻⁴ ps⁴/km.

From Fig. 2 we can see that the gain fluctuation improves markedly as the number of DCF segments increases. When m = 0, i.e. no DCF segment is inserted in the HNLF, the fluctuation is large, about 3.8 dB. When m = 4, with four DCF segments inserted periodically in the HNLF, the gain becomes flatter and wider, but the gain value is reduced because of unavoidable splicing losses.

Figure 2. Signal gain versus wavelength for different values of m (m = 0, 2, 4, 6).

Fig. 3 shows the influence of the FOPA length on gain value and flatness. In Fig. 3(a), m = 4, with four DCF segments inserted periodically in the HNLF; in Fig. 3(b), m = 6, with six DCF segments. The other parameters are the same as above. As Fig. 3 clearly shows, as the HNLF length increases the gain value rises while the flatness degrades.

Figure 3. Signal gain versus wavelength for different HNLF lengths (L = 200, 300, 400 m). (a) m = 4; (b) m = 6.

From Fig. 3(a), for m = 4: when L = 200 m we obtain a nearly flat gain of 1.40 dB over the bandwidth; when L = 400 m the gain value reaches 4.43 dB and the bandwidth remains constant, but the fluctuation increases to about 0.35 dB, which can be remedied by inserting more DCF segments. When m = 6, the fluctuation is visibly improved, as shown in Fig. 3(b).
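The cascaded-matrix gain of (6)-(9) is straightforward to evaluate numerically. The following Python sketch is an illustration of the calculation rather than the authors' code: the phase bookkeeping of ψ_m and Φ_m is simplified (Φ_m is set to zero, and the DCF step is applied as a diagonal phase matrix), and the DCF segment length l′, the splice loss L_S and the dispersion values at ω_c are assumed, since the paper does not quote them in SI units.

import numpy as np

# Sketch of the segmented two-pump FOPA gain, Eqs. (6)-(9).
c = 3e8
P1, P2 = 0.2, 0.4                  # pump powers (W), Section 4
lam1, lam2 = 1542e-9, 1560e-9      # pump wavelengths (m)
lam0 = 1591e-9                     # zero-dispersion wavelength (m)
gamma = 20e-3                      # 20 W^-1 km^-1 in W^-1 m^-1
L, m_seg = 300.0, 4                # HNLF length (m) and number of DCF insertions
lp = 10.0                          # assumed DCF segment length l' (m)
Ls = 0.05                          # assumed splice loss per splice (dB)
alpha = 10**(-0.2*Ls)              # transmittance through two splices
b2c = -1.0e-29                     # assumed beta2(omega_c): ~ -0.01 ps^2/km in SI
b4c = -2.85e-55                    # assumed beta4(omega_c): the quoted -2.85 ps^4/km
                                   # is read here with an assumed 1e-4 scale
db2p = 2.0e-27                     # DCF beta2' = 2 ps^2/km in SI

lam_c, lam_d = 0.5*(lam1 + lam2), 0.5*(lam1 - lam2)
K = 2*np.pi*c/lam0**2              # |d omega / d lambda| near the ZDWL

def gain_dB(lam3):
    l = L/(m_seg + 1)                              # HNLF segment length
    d3 = lam3 - lam_c
    dbeta = b2c*K**2*(d3**2 - lam_d**2) + b4c*K**4*(d3**4 - lam_d**4)/12.0
    dbetap = db2p*K**2*(d3**2 - lam_d**2)          # linear mismatch in the DCF
    H = np.array([1.0 + 0j, 0.0 + 0j])             # (H3(0), H4*(0)), unit signal in
    for m in range(m_seg + 1):
        am = alpha**m
        kap = dbeta + gamma*am*(P1 + P2)           # total mismatch kappa_m
        g = np.sqrt(complex((gamma*am)**2*P1*P2 - (kap/2.0)**2))
        ch, sh = np.cosh(g*l), np.sinh(g*l)
        q = 1j*gamma*am*np.sqrt(P1*P2)/g
        M = np.array([[np.exp(-1j*kap*l/2)*(ch + 1j*kap/(2*g)*sh),
                       np.exp(-1j*kap*l/2)*q*sh],
                      [np.exp(1j*kap*l/2)*np.conj(q)*sh,
                       np.exp(1j*kap*l/2)*(ch - 1j*kap/(2*g)*sh)]])
        H = M @ H
        if m < m_seg:                              # DCF phase step + splice loss
            psi = (2*gamma*am*(P1 + P2)*lp + dbetap*lp)/2.0
            H = np.array([np.exp(1j*psi), np.exp(-1j*psi)])*H*np.sqrt(alpha)
    return 20*np.log10(abs(H[0]))

for off in (-50e-9, -25e-9, 0.0, 25e-9, 50e-9):
    print(f"lam3 - lam_c = {off*1e9:+5.1f} nm : gain = {gain_dB(lam_c + off):6.2f} dB")

Sweeping lam3 over the full band and varying m_seg and L reproduces the qualitative trade-off discussed above: more DCF segments flatten the spectrum, while a longer HNLF raises the gain at the cost of ripple.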


5. Conclusion

This paper has theoretically investigated a two-pump FOPA realizing quasi-phase matching with periodic dispersion compensation. We have derived the gain expression of the FOPA from the coupled equations of the pump, signal and idler fields, and then analyzed the gain flatness of the two-pump FOPA with periodic dispersion compensation. This study shows that the FOPA's gain depends on the number of inserted DCF segments and on the HNLF length.

We have shown that as the number of inserted DCF segments increases, the fluctuation of the FOPA gain is markedly reduced, which enables flat gain over a considerably wide spectral range. However, the gain value decreases because of unavoidable splicing losses; this can be offset by increasing the HNLF length. The gain increases as the HNLF length increases, while the flatness degrades; the parametric gain can then be flattened again by inserting more DCF segments.

This paper thus investigates theoretically an approach that realizes quasi-phase matching of two-pump FOPAs, which is of practical significance for designing a FOPA that equalizes amplification over a wide spectral range.

References

[1] J. Hansryd and P.A. Andrekson, "Broad-band continuous-wave-pumped fiber optical parametric amplifier with 49-dB gain and wavelength conversion efficiency", IEEE Photon. Technol. Lett., vol. 13, no. 3, Mar. 2001, pp. 194-196.
[2] J. Hansryd, P.A. Andrekson, M. Westlund, J. Lie, and P.-O. Hedekvist, "Fiber-based parametric amplifiers and their applications", IEEE J. Select. Topics Quantum Electron., vol. 8, no. 3, May/Jun. 2002, pp. 506-520.
[3] J.L. Blows and S.E. French, "Low-noise-figure optical parametric amplifier with a continuous-wave frequency-modulated pump", Opt. Lett., vol. 27, 2002, pp. 491-493.
[4] M.E. Marhic, N. Kagi, T.-K. Chiang, and L.G. Kazovsky, "Broadband fiber optical parametric amplifiers", Opt. Lett., vol. 21, 1996, pp. 573-575.
[5] Li Qiliang, Li Yuanmin, and Qian Sheng, "Gain of cascaded two-pump fiber-optical parametric amplifier with high order dispersion", Chinese J. Lasers, vol. 33, no. 6, 2006, pp. 760-764.
[6] Laurent Provino, Arnaud Mussot, Eric Lantz, Thibaut Sylvestre, and Hervé Maillotte, "Broadband and flat parametric amplifiers with a multisection dispersion-tailored nonlinear fiber arrangement", J. Opt. Soc. Am. B, vol. 20, no. 7, 2003, pp. 1532-1537.
[7] M.E. Marhic, Frank S. Yang, Min-Chen Ho, and Leonid G. Kazovsky, "High-nonlinearity fiber optical parametric amplifier with periodic dispersion compensation", Journal of Lightwave Technology, vol. 17, no. 2, February 1999, pp. 210-215.
[8] G.A. Nowak, Y.-H. Kao, T.J. Xia, and M.N. Islam, "Low-power high-efficiency wavelength conversion based on modulational instability in high-nonlinearity fiber", Opt. Lett., vol. 23, Jun. 1998, pp. 936-938.
[9] Jaeyoun Kim, Özdal Boyraz, Jin H. Lim, and Mohammed N. Islam, "Gain enhancement in cascaded fiber parametric amplifier with quasi-phase matching: theory and experiment", Journal of Lightwave Technology, vol. 19, no. 2, 2001, pp. 247-251.
[10] Ngai Wong and Kenneth K.Y. Wong, "Gain bandwidth optimization in two-pump fiber optical parametric amplifiers under bounded zero-dispersion wavelength fluctuations", Optics Communications, vol. 272, no. 2, 2007, pp. 514-520.


ISIS – Urban Radio Plan and Time-Variant Characteristics of a Mobile Vehicular Network

Serenus Jeyakumar, David Linton, Senior Member IEEE
Institute of Electronics, Communications and Information Technology (ECIT)
Queen's University Belfast, BT3 9DT
Fax: +44 (0)28 9097 1702; Tel: +44 (0)28 9097 1808
Email: sjeyakumar01@qub.ac.uk, d.linton@ee.qub.ac.uk

Abstract - The Integrated Sensor Information System (ISIS) will detect potential criminal/terrorist threats on the bus public transport network using audio/visual/RF sensors and inform decision makers at a central control point, who use this information to decide on the perceived threat level. The ISIS system will also manage its own network so that it is not reliant on commercial mobile phone networks. iBurst technology was proposed for communication between a mobile bus and the Central Control Point. This paper is a study of the radio plan, propagation channel and time-variant characteristics of an iBurst system in a vehicular network.

Keywords: iBurst, urban propagation modelling, vehicular network, sensor.

I Introduction

Given recent occurrences of crime and terrorist attacks on public transport, it is difficult to guarantee the safety of the public on board these transportation units. It is, however, possible to record the events and crimes taking place through CCTV and to analyze the footage later to capture the criminal. The problem is that there is no means of preventing the crime from happening: action is taken against the criminal only after a crime has occurred, leaving the victim, if they survive the attack, traumatized for an unknown period of time depending on the crime committed.

Some examples of major crime/terrorist attacks on public transportation in the recent past are the 9/11 airplane incidents, the 7/7 bombings in London, and the train bombings in Madrid, Spain and Mumbai, India. In addition to these, we also have to consider smaller attacks aimed at the individual with knives, guns, etc., and fraudulent claims about incidents that may not have occurred on buses. In response, governments have called upon the scientific and engineering research communities to play an important role in the effort to fight crime and terrorism [1].

ISIS aims to work against these crimes on buses by providing an integrated security system within the platform that communicates with a control centre. ISIS will detect threats, inform decision makers of those threats, and manage its own network. The proposed system has the potential to reduce assaults on the public and staff on transport units. Research of a similar nature is being conducted on trains in France under the Celtic-BOSS project [2], but there is almost no research being performed on buses.

Fig1. Overview of ISIS

This paper, associated with ISIS, gives an overview of the wireless and electromagnetic systems. A number of simulations using ray-tracing software and electromagnetic simulation tools are used to study the behaviour of microwaves in a communication channel between a mobile bus and a network operation centre, and this data is used to build an optimal and robust wireless network. Studies and measurements have previously been performed on the propagation modelling of populated indoor environments and the effect of human beings on the propagation of microwaves [3].

As the data exchange between the bus and the network operation centre is crucial, the proposed network should not have significant variation in received power levels during the exchange of data.
This variation is due to fading, environmental clutter and other characteristics that affect the propagation of microwaves [4].

II. iBurst technology

The iBurst wireless broadband system was initially developed by ArrayComm in 2001 and has been issued under the IEEE 802.20 standard. iBurst is licensed to operate within the frequency range of 1.7 GHz to 2.3 GHz.


Across a 5 MHz spectrum, the maximum data rate reaches up to 1,061 kbps on the downlink and 346 kbps on the uplink. iBurst can maintain high data rates even during high-speed movement of user terminals [5], [6]. The iBurst system uses an asymmetric TDD/TDMA frame designed for 625 kHz channelization and performs Spatial Division Multiple Access (SDMA) using adaptive antenna array technology, with adaptive modulation and coding for efficient use of frequency resources [4]. The adaptive antenna arrays have 12 antenna elements at the Base Station (BS), which enhance signals transmitted to and received from the User Terminal (UT) while suppressing interference from other UTs in the same time slot and the same frequency band but at different angles of arrival [6], [7]. There have been studies of the outdoor-to-indoor propagation characteristics of iBurst in urban and suburban areas with stationary UTs; these investigated the temporal signal variation observed on wireless broadband channels in urban and suburban environments [6].

The iBurst base stations in Belfast, operating at 1795 MHz to 1800 MHz, are under trial. iBurst technology was proposed for use with the ISIS project in the vehicular network system for the transfer of data from sensors within the bus to a network operation centre where the data will be monitored.

III Sensors

The data transmitted over the vehicular network system is sourced from sensors such as CCTV cameras, CAN bus sensors and audio sensors located at different positions on board the bus. 16 video cameras are located in different positions, forming a closed circuit to monitor events and incidents happening on board the bus. The data from the CCTV cameras will be transmitted from the bus through the proposed communications channel to the network operation centre to inform decision makers of the perceived threat level.

Controller Area Network (CAN) bus sensors will be integrated with the system, and a combination of data from the CAN bus sensors and the CCTV cameras will provide evidence against fraudulent claims by passengers after an event or incident has occurred. Audio sensors will be used to study and analyze the voice patterns of passengers on board the bus. For this data to be transmitted over a vehicular network system, a robust wireless network for live transmission of crucial data is required, and hence a radio plan of the city is needed to study and analyze the received/transmitted power and the propagation paths within the network for a moving vehicle. This study is performed using WinProp, a ray-tracing software package, and is shown below.

IV. Database

A bus route from the Titanic Quarter in Belfast through the City Centre to the South Belfast area was chosen for study. The proposed bus route is approximately 6.5 km long and comprises urban (densely populated) and suburban (less dense) clutter. Clutter data was obtained by purchasing rooftop outlines from the Ordnance Survey of Northern Ireland (OSNI). This data was imported into the ray-tracing software (WinProp) and the approximate heights of buildings were entered manually.

Fig2. OSNI tiles map of Belfast City Centre
Fig3. 3D view of Belfast City Centre in WinProp

V. Urban Radio plan

WinProp is the ray-tracing software tool used to study the power distribution, propagation paths and time-variant properties of signals in an urban or suburban environment. Drawings obtained from Wrightbus were used to build a model bus in WinProp.
The UT (the bus) is programmed to move along a designated path while the predictions from the BS to the UT are computed.

Fig 4 shows the propagation paths and the received power at the UT from the BS. This is a suburban environment with less clutter, and hence there is no critical signal variation over time.

Fig 5 shows the propagation paths and the received power at the UT from the BS in the City Centre (urban environment). Due to dense clutter and multipath propagation, significant fading effects are predicted. Screenshots of the City Centre in Belfast from the simulation and from Google Maps are shown for comparison of the clutter data.
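WinProp's deterministic ray-tracing predictions cannot be reproduced here, but a simple log-distance path-loss model with log-normal shadowing gives a rough feel for the kind of received-power variability such predictions quantify. The Python sketch below is an empirical stand-in, not the ISIS methodology; the path-loss exponents, shadowing spreads and transmit power are all assumed values:

import numpy as np

rng = np.random.default_rng(1)

# Log-distance path loss with log-normal shadowing (all parameters assumed).
f_mhz = 1800.0                        # near the Belfast iBurst trial band
d0 = 1.0                              # reference distance, m
n_sub, n_urb = 2.7, 3.5               # assumed path-loss exponents
sig_sub, sig_urb = 4.0, 8.0           # assumed shadowing std dev, dB
pl_d0 = 20*np.log10(4*np.pi*d0*f_mhz*1e6/3e8)   # free-space loss at d0

def rx_power_dbm(tx_dbm, d, n, sigma):
    """Received power along a route of BS-UT distances d (m)."""
    pl = pl_d0 + 10*n*np.log10(d/d0) + rng.normal(0, sigma, size=d.shape)
    return tx_dbm - pl

d = np.linspace(50, 170, 121)         # a ~120 m stretch of route
p_sub = rx_power_dbm(40, d, n_sub, sig_sub)
p_urb = rx_power_dbm(40, d, n_urb, sig_urb)
print(f"suburban swing: {p_sub.max() - p_sub.min():.1f} dB")
print(f"urban swing   : {p_urb.max() - p_urb.min():.1f} dB")

Even this crude model shows the urban case swinging far more widely than the suburban one, in line with the deterministic predictions of Figs 4-6 below.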


Fig4. Prediction of radio propagation in a less dense urban environment
Fig5. Prediction of radio propagation in an urban environment
Fig6. Propagation paths and signal strength variation with time and motion of the bus


Fig 6 shows the time-variant characteristics of the received signal power and the propagation paths from the BS to the mobile UT. In a sample approximately 120 m long, from a random point A to destination B, there is a significant signal strength fluctuation of approximately 23 dB. Considering the importance of the data being exchanged between the UT and the BS in the ISIS project, such variations in signal strength should be minimised for the transmission of crucial information.

VI Conclusion

A radio plan for two different locations with different clutter data was simulated from a single base station. It was found that there was more scattering, signal loss and signal strength fluctuation in the urban area with dense clutter than in the suburban area. This data can be used to create a more robust wireless network. Future work includes enhancing the sensor network, a radio plan simulation of the bus route using an iBurst network with three base stations, and carrying out measurements of data transmitted from the sensors on the bus through the iBurst network.

VII Acknowledgements

The authors would like to thank the Engineering and Physical Sciences Research Council (EPSRC) for the funding granted towards the ISIS project and the ISIS Consortium members for their involvement with the ISIS project.

References

[1] Hsinchun Chen, Fei-Yue Wang, Daniel Zeng, "Intelligence and Security Informatics for Homeland Security: Information, Communication, and Transportation", IEEE Transactions on Intelligent Transportation Systems, Vol. 5, no. 4, pp. 329-341, December 2004.
[2] D. Sanz, V. Delcourt, O. Gatin, M. Berbineau, S. Ambellouis, L. Khoudou, "BOSS: intelligent embedded video surveillance system with real-time transmission of video to the control centre".
[3] K. I. Ziri-Castro, N. E. Evans and W. G. Scanlon, "Propagation Modelling and Measurements in a Populated Indoor Environment at 5.2 GHz", in: Proceedings of the 1st IEEE International Conference on Wireless Broadband and Ultra Wideband Communications, 13-16 March 2006, Sydney, Australia.
[4] Shiann-Shiun Jeng, Guanghan Xu, Hsin-Piao Lin, and Wolfhard J. Vogel, "Experimental Studies of Spatial Signature Variation at 900 MHz for Smart Antenna Systems", IEEE Transactions on Antennas and Propagation, Vol. 46, No. 7, July 1998, pp. 953-962.
[5] http://global.kyocera.com/prdct/telecom/office/iburst/technicaloverview.pdf
[6] Hajime Suzuki, Carol D. Wilson and Karla Ziri-Castro, "Time Variation Characteristics of Wireless Broadband Channel in Urban Areas", 1st European Conference on Antennas and Propagation, 6-10 Nov. 2006, Nice, France.
[7] Martin Cooper, Marc Goldburg, "Intelligent Antennas: Spatial Division Multiple Access", 1996 Annual Review of Communications, pp. 999-1002.


Investigation of Dispersive Fading in UWB Over Fiber Systems

A. Castillo, P. Perry, P. Anandarajah and L.P. Barry
RINCE, Dublin City University, Dublin, Ireland. Antonio.Castillo@eeng.dcu.ie

Abstract

We present an analysis of the response and performance of an optical system for high bit-rate, long-distance RF signal transmission combining UWB (Ultra-WideBand) and RoF (Radio over Fibre) technologies. We analyse the frequency response of the system and measure the PER obtained at the receiver. In this way, we examine the relationship between the main system parameters, such as the bias current of the laser, the frequency response and the length of fiber used. We compare the results for two different lasers, obtaining a consistent relationship for a low-bandwidth laser, but more complicated, inconsistent behaviour when a higher-bandwidth laser is used.

I. Introduction

Radio over Fibre (RoF) is a technology that offers the possibility of transmitting RF signals, in a transparent way, over an optical fibre distribution system. It works over long distances with low latency and attenuation, but with a frequency response that is affected by dispersive fading and by nonlinear effects in the laser and fiber.

Ultra-WideBand (UWB) technology, since its regulation by the FCC in February 2002, has been developed as a commercially attractive method to implement high-speed wireless transmission in different end-user technologies with low power consumption. To achieve this, it uses Multi-Band Orthogonal Frequency Division Multiplexing (MB-OFDM) in 528 MHz sub-bands, with 128 sub-carriers each. It achieves maximum speeds between 320 Mbps and 480 Mbps, and it is expected to reach 1 Gbps in the near future with new draft specifications.

In this investigation we use the Wisair 9110 Developers Kit, which generates a UWB signal in band group 1, from 3.1 to 4.7 GHz, using three sub-bands. The board directly modulates a laser connected through single-mode fibre to a photodetector, which in turn feeds the receiver on the peer Wisair board. This system therefore offers a low-cost way to extend the reach of UWB systems, needing only an opto-electrical conversion at the receiver end to radiate the converted signal through UWB antennas using simple transceivers.

Fig. 1: System setup for dispersive fading test.


In the system, the RF signal is directly modulated onto the optical carrier as a double sideband (where the two sidebands are separated by twice the RF carrier frequency), although different methods could be used to improve these results.

Due to fibre dispersion, these sidebands propagate at different speeds, so that the received RF signal experiences fading [1]. This causes signal strength reduction and inter-symbol interference, resulting in errored bits that can often be corrected by Forward Error Correction (FEC). If the level of FEC is not sufficient to overcome the channel degradation, however, packet errors will result.

In this paper, we evaluate the frequency-dispersion behaviour of the RoF system and compare this with the performance at the application layer, measuring the Packet Error Rate (PER) obtained at the receiver for two different lasers.

We show the relationship between parameters and the importance of adjusting them depending on the optical power, the RF power and the length of fiber used in the transmission, given that the frequency response obtained from the system depends on them.

II. Dispersive fading

To analyze dispersive fading and its impact on the overall frequency response, a Vector Network Analyzer (VNA) was connected to the laser and the photo-detector at the ends of the optical transmission system, as shown by the dotted lines in figure 1. The frequency response obtained typically shows a transmission notch [3] that at first appears at a higher frequency than the operating band. The measurements show that increasing the transmission length makes the notch shift down in frequency and interfere with the RF signal. This is a dispersive fading effect which causes degradation and should be addressed [6].

Thus, the frequency response was measured for different lengths of fiber and laser bias currents using two different lasers. Figure 2 illustrates the frequency response of a commercially available single-mode laser used in the RoF system with 37 km of Single Mode Fibre (SMF). The laser was a temperature-controlled, hermetically sealed high-speed butterfly package, with a bandwidth of about 16 GHz at a bias current of 30 mA. It has a room-temperature emission wavelength of 1545 nm and a threshold current of 10 mA. Tests show that the nonlinearities in the laser become more pronounced at lower bias points and cause the dispersive fading notch to appear at unpredictable frequencies and vary in depth.

Fig. 2: High-bandwidth laser frequency response of the system for a 37 km fiber transmission length and different bias currents applied.
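The notch positions themselves follow the classic small-signal model for double-sideband transmission over dispersive fiber, in which the detected RF power varies as cos²(πLDλ²f²/c); for a directly modulated laser, the chirp parameter α shifts the notches to cos²(πLDλ²f²/c + arctan α). The Python sketch below is our own illustration, not the authors' analysis: the fiber dispersion D is a typical SMF value and α is an assumed value, chosen so that the first notch for 14 km lands near the 7 GHz reported later for the low-bandwidth laser:

import numpy as np

# Small-signal DSB response of a dispersive IM-DD link with laser chirp:
#   P_RF(f) ~ (1 + a^2) * cos^2(pi*L*D*lam^2*f^2/c + arctan(a))
# D and a are assumed, not measured values from this paper.
c = 3e8
lam = 1540e-9        # low-bandwidth laser emission wavelength
D = 17e-6            # 17 ps/(nm km) in s/m^2, typical SMF (assumed)
a = 3.4              # assumed chirp parameter (fitted to the 14 km notch)

f = np.linspace(1e9, 10e9, 4000)
for L_km in (14, 37, 51):
    theta = np.pi * L_km*1e3 * D * lam**2 * f**2 / c
    resp = (1 + a**2) * np.cos(theta + np.arctan(a))**2
    resp_db = 10*np.log10(np.maximum(resp, 1e-12))
    f_notch = f[np.argmin(resp_db)]    # only one notch falls in this span
    print(f"L = {L_km:2d} km -> notch near {f_notch/1e9:.1f} GHz")

Under these assumptions the model reproduces the 1/sqrt(L) scaling of the measured notch frequencies (roughly 7.0, 4.3 and 3.7 GHz for 14, 37 and 51 km).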


This is due to the presence of harmonics of the modulating RF signal appearing as multiple sidebands on the optical carrier. It makes the behaviour very complex and system performance difficult to predict, as can be seen in the graph, where a difference of a few milliamps in bias current can make the notch change from 5 dB to 30 dB in depth and shift in frequency. These results show that further investigation would be needed to understand these effects, obtain an adequate response and integrate this type of laser in a UWB RoF system.

The solution chosen to overcome this unpredictable behaviour was the use of a lower-bandwidth laser, so that higher-order harmonics could not be generated. In this way, the negative effects on the system response are avoided. As expected, in this case the dispersive fading notches appeared at the expected frequencies and did not shift with changing laser diode bias current, making the system behaviour much more stable and easier to study.

Figure 3 shows the frequency response of this commercially available single-mode laser in a hermetically sealed TO-can. It has an emission wavelength of 1540 nm and a threshold current of 14 mA. Results indicate that the laser has an inherent resonance around 2 GHz in its back-to-back performance due to packaging parasitics, but a reasonably flat response at the Band Group 3 frequencies that we will use. The dispersive fading behaviour is also clearly shown here, as the notch appears at 7 GHz for 14 km, 4.5 GHz at 37 km and 3.8 GHz at 51 km. These curves are not bias dependent, which indicates a high degree of linearity in the system. Using this laser, it is easier to predict system performance and tune parameters to obtain the best possible PER, in contrast to the system behaviour with the high-bandwidth laser.

Both sets of results shown in figures 2 and 3 have the working band of our system, between 3.1 and 4.7 GHz, highlighted.

Fig. 3: Low-bandwidth laser frequency response of the system for different fiber transmission lengths and a 25 mA bias current applied.

III. Packet error rate (PER)

To characterize the achievable PER, a bidirectional system was set up, because the Wisair boards must see each other to establish transmission, as indicated by the solid lines in figure 1. As shown there, the uplink optical connection was made back-to-back, instead of over the fiber reel, to avoid using multiple optical amplifiers and to isolate the performance of a single transmitting path. The transmission parameters varied in the tests were the bias current of the laser, the fiber length and the bitrate of the transmission.


Most of the different speeds the UWB standard supports were used: 53.3 Mbps, 80 Mbps, 160 Mbps, 320 Mbps and 480 Mbps. The PER was measured from the receiver Wisair board's application software for the different scenarios [2], [5].

The PER graphs for the high-bandwidth laser are shown in figure 4 and indicate rather peculiar behaviour, which seems to be the result of the complex interaction of the laser nonlinearities and the fibre dispersion, so that the effective channel the Wisair radio experiences is unpredictable. However, the results in figure 5, showing the performance for the low-bandwidth laser, correlate well with the frequency response graphs of figure 3.

Fig. 4: High-bandwidth laser 10% PER points for different transmission bitrates and fiber lengths with: a) 15 mA, b) 20 mA and c) 25 mA bias currents applied to the laser.


It can be seen from these graphs that transmission cannot be established for distances greater than 14 km in the 480 Mbps case, as this scheme uses no FEC. By contrast, reduced bit rates use higher levels of FEC and are capable of overcoming the channel degradation caused by dispersive fading.

Moreover, for the low-bandwidth laser, Dispersion Compensating Fiber (DCF) could be used to improve the PER and obtain at 37 km a performance similar to that at 12 km. By contrast, using DCF did not improve system performance with the higher-bandwidth laser. This is explained by the fact that the main phenomenon affecting performance for the low-bandwidth laser is dispersion, whereas additional effects occur in the system for the higher-bandwidth laser, keeping the performance low.

It is also interesting to note that the received optical power needed to achieve 10% PER is approximately 3 dB higher for the laser biased at 25 mA than for that at 20 mA, for a given distance.

Fig. 5: Low-bandwidth laser 10% PER points for different transmission bitrates and fiber lengths with: a) 20 mA and b) 25 mA bias currents applied to the laser.


This is due to the reduction in modulation depth and the associated increase in carrier power when the bias current is increased.

IV. Conclusions

In this paper we have shown that the simple dispersive fading model is insufficient to explain the behaviour of UWB-over-fiber systems when the directly modulated lasers have a high bandwidth. By contrast, a low-cost laser with restricted bandwidth offers improved system performance that is predictable and could be integrated in commercial systems.

Furthermore, the increased receiver sensitivity that can be gained by using a low bias current gives benefits for short lengths of fibre, needing less power to obtain similar error rates.

To summarize, it has been shown that these systems have a clear practical application and could be used to distribute UWB signals with many advantages over other approaches. This can be done using optical fiber over a long distance with low attenuation and with good frequency behaviour, taking into account the relationships between the different parameters involved and studied in this paper.

Acknowledgements

We would like to thank Linda Doyle of Trinity College Dublin for the loan of the Wisair boards used in this work, and the company Wisair for their technical support.

References

1. S. Yaakob, W. R. Wan Abdullah, "Effect of Laser Bias Current on the Third Order Intermodulation in the Radio over Fibre System", International RF and Microwave Conference Proceedings (2006).
2. M.L. Yee, V.H. Pham, "Performance Evaluation of MB-OFDM Ultra-Wideband Signals over Single-mode Fiber", IEEE International Conference on Ultra-Wideband (ICUWB) (2007).
3. F. Ramos, J. Martí, "Frequency Transfer Function of Dispersive and Nonlinear Single-Mode Optical Fibers in Microwave Optical Systems", IEEE Photonics Technology Letters (2000).
4. T. Alves, A. Cartaxo, "Performance Degradation Due to OFDM-UWB Radio Signal Transmission Along Dispersive Single-Mode Fiber", IEEE Photonics Technology Letters (2009).
5. A. Pizzinat, P. Urvoas, "1.92 Gbit/s MB-OFDM Ultra Wide Band Radio Transmission over Low Bandwidth Multimode Fiber", OFC/NFOEC (2007).
6. Y. Ben-Ezra, M. Ran, "WiMedia-Defined, Ultra-Wideband Radio Transmission over Optical Fibre", OFC/NFOEC (2008).
7. A.J. Lowery, "Fiber nonlinearity pre- and post-compensation for long-haul optical links using OFDM", Optics Express, vol. 15, no. 20 (2007).


An Efficient Consolidated Authentication Scheme for the Handover Process in Heterogeneous Networks

Mei Song 1, Li Wang 1, Xiaojun Wang 2, Lei Wang 1, Junde Song 1
1 School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing, P.R. China, songm@bupt.edu.cn
2 School of Electronic Engineering, Dublin City University, Ireland, xiaojun.wang@dcu.ie

Abstract—In order to reduce the authentication latency during the handover process between different wireless networks, this paper proposes a novel consolidated authentication scheme in which both re-authentication and pre-authentication are performed. When a mobile node (MN) moves into a new domain, its identity must be verified by the EAP (Extensible Authentication Protocol) server. In this paper, the EAP-MD5 method is adopted for the authentication process. After the first authentication, the re-authentication method is executed to reduce the time required for producing the master session key (MSK) between the MN and the EAP server before the MN leaves the domain. If the MN moves into a new domain, the pre-authentication method is carried out to reduce the required authentication time. Theoretical analysis and a performance evaluation of the proposed scheme are presented, demonstrating the scheme's efficiency. Finally, a detailed implementation of EAP-MD5 using Open Diameter is given.

Keywords-consolidated authentication; Open Diameter; pre-authentication; re-authentication

I. INTRODUCTION

Driven by increasing user demand, an integrated access technology for seamless wireless communication mobility is becoming one of the objectives of next-generation wireless communication technologies. When mobile equipment changes its access point, the continuity of service is affected by the associated handover process. During handover, it is necessary to provide authentication mechanisms to guarantee the security of resource configuration and distribution. The latency of authentication is becoming the bottleneck of fast handover.

Many research groups focus on fast authentication during the handover process. Document [1] gives a description of the EAP fast re-authentication method and the related key management framework; however, it is only suitable for intra-domain handover. The EAP pre-authentication problem statement and related ideas are presented in [2], while the discussion of extending PANA (Protocol for Carrying Authentication for Network Access) for pre-authentication is documented in [3]. All the references mentioned above consider different scenarios, but none of them provides a cooperative mechanism to perform consolidated intra-domain and inter-domain handover.

Through analysis of the authentication mechanisms used during the handover process, this paper proposes a novel consolidated authentication scheme in which EAP pre-authentication (PA) and EAP re-authentication (RA) are adopted, and a central authentication server is defined to manage the detailed pre-authentication between different domains and the re-authentication in the same domain, so as to reduce the authentication latency further.

II. PROPOSED CONSOLIDATED AUTHENTICATION SCHEME

A. Hierarchical Authentication Architecture

In order to reduce the signaling time and thus the handover latency, this section first proposes a hierarchical authentication architecture (HAA), which can be divided into two layers: the home EAP server and the local RA servers. From figure 1, it can be seen that the HAA is composed of several authentication domains.
There are many intra-domain RA authenticators and inter-domain RA authenticators in each authentication domain, and a local RA server is deployed at the center of each domain.

Figure 1. The hierarchical authentication architecture

To describe the problem clearly, figure 1 gives a detailed view of only two authentication domains: authentication domain 2 and authentication domain 3. In each domain, a local RA server performs the re-authentication process for the mobile peer. The local RA server is also responsible for requesting the key information for inter-domain pre-authentication, and for collecting and saving the neighbour information for the inter-domain RA authenticators.


information for inter domain pre-authentication, and forcollecting and saving the information of neighbors for theinter domain RA authenticators.This scheme is proposed based on the EAP framework,which relies on the EAP method layer to perform thespecific authentication process. The pass-throughauthentication model can be seen from figure 2. In addition,the key material, including Master Session Key (MSK) andextended Master Session Key (EMSK), will be producedby the EAP methods, such as EAP-MD5.Figure 3. Initialization process for the re-authentication procedureFigure 2. The pass-through authentication modelBefore taking the inter domain pre-authentication, itshould be assumed that the serving authenticator coulddetect the information of neighboring candidateauthenticators (CA), including the authentication domainname and the name of the local RA server. However, theneighbor information discovered should be stored in thelocal RA server. The detailed topology of neighbors isdefined when each domain is deployed and could beadjusted dynamically.Generally speaking, pre-authentication requires anaddress of a RA candidate authenticator (CRA) to bediscovered either by a peer or by a serving RAauthenticator (SRA) or by some other entities beforehandover. In this paper, the authenticator discoveryprotocol, which is typically defined as a separate protocol,is not considered. For both intra-domain and inter-domainhandover, the IP address of a candidate authenticator mustbe reachable by the peer or the SRA that is performing thepre-authentication. Note that, an authenticator discoveryrequires a database of the neighboring network information.B. EAP Re-authentication for Intra HandoverFigure 1 illustrates the EAP re-authenticationprocedures when a mobile peer moves between differentauthenticators within a same authentication domain. Thenfigure 3 and figure 4 show the initialization process and theEAP re-authentication procedure respectively.In authentication domain 2, the mobile peer needs toknow the name of the local authentication domain or thename of local RA server when it first enters the domain.As is shown in figure 3, the normal EAP process will beperformed first, including EAP request and EAP responsemessages. Between the mobile peer and the EAP server, aMSK will be produced and sent to the authenticator (SRA),used for making the secure connection subsequently.During the process of EAP exchange, the EMSK isproduced between the mobile peer and the EAP server.During the EAP re-authentication process, the local RAserver will request a Domain Specific Root Key (DSRK)for the home EAP server by sending its own domainname. Then the home EAP server will calculate theDSRK based on the EMSK,and send the DSRK and itsEMSK name to the local RA server. On receiving therelated messages, the re-authentication Root Key(DSrRK)andre-authentication Integrity Key(DS-rIK)will be calculated using DSRK, meanwhile, DS-rRK andDS-rIK are gotten by the mobile peer. When the mobilepeer moves from the SRA to the nSRA (next SRA), theEAP re-authentication will be triggered. nSRA will sendEAP-Initiate/Re-auth-Start message to indicate the startof EAP re-authentication. After receiving the EAPInitiate/Re-auth message, the local RA server uses thestill valid DS-rRK to produce a re-authentication MSK(rMSK), and send it to nSRA, as shown in figure 4.Similarly, based on DS-rRK the mobile peer can alsogenerate rMSK which can be used as Master Session Key(MSK) for secure connection. 
Here, it is unnecessary toretrieve the MSK from home EAP server, and themultiple information exchange operations between thepeer and the EAP server could be omitted. The proposedscheme can thus significantly reduce the handoverlatency, especially in the roaming mode when the mobilepeer is far away from the home EAP server with longersignal round trips.Figure 4. Re-authentication procedureC. EAP Pre-authentication for Inter HandoverFrom figure 1, we can see that when the mobile peermoves along route b, it crosses the boundaries of theauthentication domain 2 and domain 3. The intra domainre-authentication scheme devised in the above section is130


C. EAP Pre-authentication for Inter-domain Handover

From figure 1, we can see that when the mobile peer moves along route b, it crosses the boundary between authentication domain 2 and domain 3. The intra-domain re-authentication scheme devised in the previous section is not applicable here, because the authenticator SRA and the authenticator CRA are located in different authentication domains. The key material (DS-rRK and DS-rIK) from authentication domain 2 is not allowed to be used in authentication domain 3. Therefore, we propose a novel consolidated inter-domain authentication scheme. Figure 5 shows the detailed pre-authentication procedure between the SRA and the CRA.

In figure 5, when a mobile peer moves to the SRA, which stands at the border of authentication domain 2, the SRA first carries out the re-authentication process and then starts the pre-authentication process. The detailed message flow is as follows.

When a mobile peer authenticates through the SRA, local RA server 2 chooses its CA and starts the pre-authentication process by sending an EAP Pre-auth Request/Identity from the SRA to the CRA. On receiving the EAP Pre-auth Response/Identity message from the peer, the SRA requests its neighbour information from local RA server 2. The SRA-CRA mapping information stored in local RA server 2 is sent to the SRA, and related information about the CRA, such as the CRA's domain name, is sent to the mobile peer. After receiving its neighbour information, the SRA sends the received EAP Pre-auth Response/Identity message to local RA server 3 through the CRA. Subsequently, local RA server 3 requests the valid key material, such as the DSRK and the EMSK name. Then DS-rRK and DS-rIK are calculated and stored.

Figure 5. Pre-authentication procedure

When the mobile peer moves into domain 3, the rMSK generated in domain 2 can be sent to the CRA. In addition, the new access domain name obtained by the mobile peer can be used to produce a new rMSK. Thus, a secure connection can be established. In pre-authentication mode, the signaling operations are restricted to the RA server and the mobile peer; there is no need to contact the home EAP server.
III. PERFORMANCE EVALUATION AND IMPLEMENTATION

In order to reduce the system cost and to further reduce the handover delay, we proposed a consolidated authentication scheme in section II. In this section, a detailed implementation and a performance analysis of the proposed scheme are given.

A. Implementation

The software used in this paper is open source. For the AAA architecture and the Diameter protocol, the Open Diameter implementation is used. It supports different Diameter applications such as EAP, PANA and NASREQ. Opendiameter-1.0.7-i is used in this paper. The base protocol implementation is available as a C++ library and currently supports Linux, BSD and Windows systems. An implementation of EAP and PANA for client/user network access is also available under the Open Diameter project. Currently, the EAP stack already supports methods including EAP-MD5, EAP-TLS, EAP-AKA and EAP-GPSK. Note that this section takes EAP-MD5 as the example. Using EAP-MD5, the whole PANA latency between the PAC and the PAA is about 2.43 seconds when the MN first moves into a new domain and performs its first authentication.

In summary, the state machine of the pass-through authentication model can be seen in figure 6. When the pass-through authenticator receives a message, it changes from the "IDLE" state to the "Received" state and starts to process the received message. If the message is an authentication request sent from the backend server, the authenticator enters the "SendRequest" state and forwards the request to the peer. If the message is an authentication response sent from the peer, the authenticator enters the "SendResponse" state and forwards the response to the backend authentication server. If the message is invalid, the authenticator discards it and returns to the IDLE state unconditionally. After each sending action (sending a request/response), the authenticator returns to the IDLE state.

Figure 6. State machine for the pass-through authentication model
Figure 7. EAP-MD5 state machine on the MN
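The pass-through behaviour of figure 6 is easy to capture in code. The Python sketch below is an illustrative model of the cycle just described, not Open Diameter's implementation:

from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    RECEIVED = auto()
    SEND_REQUEST = auto()
    SEND_RESPONSE = auto()

class PassThroughAuthenticator:
    """Toy model of the pass-through state machine of figure 6."""
    def __init__(self):
        self.state = State.IDLE

    def on_message(self, msg: dict) -> State:
        # Any incoming message moves IDLE -> Received.
        self.state = State.RECEIVED
        if msg.get("kind") == "request" and msg.get("from") == "server":
            self.state = State.SEND_REQUEST      # forward request to the peer
        elif msg.get("kind") == "response" and msg.get("from") == "peer":
            self.state = State.SEND_RESPONSE     # forward response to the server
        # Invalid messages are simply discarded.
        visited = self.state
        self.state = State.IDLE                  # always return to IDLE
        return visited

auth = PassThroughAuthenticator()
print(auth.on_message({"kind": "request", "from": "server"}))   # SEND_REQUEST
print(auth.on_message({"kind": "response", "from": "peer"}))    # SEND_RESPONSE
print(auth.on_message({"kind": "junk"}))                        # RECEIVED (discarded)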


In addition, the Open Diameter implementation uses different state machines to model the different EAP methods. In this section, we take EAP-MD5 as an example. Figure 7 shows the state machine for the EAP-MD5 method used on the MN [7]. StInitialize is the initial state and StSuccess is the final state. The events that trigger the transitions are named EvSgIntegrityCheck, EvSgReject and EvSgAccept, respectively. The actions started on each transition are named AcDoIntegrityCheck, AcNotifyInvalid and AcNotifySuccess. Figure 8 shows the state machine for the EAP-MD5 method used on the AAAH.

Figure 8. EAP-MD5 state machine on the AAAH

B. Performance Analysis

In this paper, the EAP framework is used to meet the heterogeneous requirements of future interworking networks. When a handover occurs, the normal EAP authentication process is triggered. It should be noted that, after the MN sends the EAP Response/Identity message, at least one interactive operation is needed to produce the MSK; the time required for these operations depends on the EAP method used. The process of producing the MSK incurs additional cost in terms of delay, which is undesirable, especially in the roaming case.

In the proposed scheme, when an MN enters a new domain it performs its first authentication using the normal EAP methods, such as EAP-MD5 and EAP-TLS [8]. Then, before the MN moves out to another new domain, the re-authentication mode can be used for the subsequent authentications. On the other hand, when the MN leaves its current domain, the pre-authentication mode is adopted. A comparison of the authentication signaling operation times is given in Table 1.

Table 1. Performance comparison

  EAP method (proposed)    Authentication signaling operation time
  EAP-MD5                  1.5 round trips (peer - EAP server)
  EAP-TLS                  3 round trips (peer - EAP server)
  Re-auth                  1 round trip (peer - local RA server)
  Pre-auth                 1 round trip (peer - local RA server)

The RA server in each authentication domain plays a critical role. It not only performs re-authentication in its own local domain, but also collects and manages the neighbour information during the pre-authentication process. Before an intra-domain handover, the initialization of intra-domain authentication is executed in advance, with the aim of establishing a secure connection using the obtained DS-rRK, DS-rIK, etc. The remaining authentication process then does not need to contact the EAP server, and only one round trip is necessary between the peer and the local RA server, as can be seen in Table 1. In summary, the proposed scheme significantly reduces the total authentication cost, in terms of both the number of authentication signals and the distances the authentication signals have to travel.

IV. CONCLUSIONS

Based on the EAP framework, this paper proposes a consolidated authentication scheme in which re-authentication is used in the intra-domain handover process and pre-authentication is adopted in the inter-domain handover process. The performance analysis demonstrates that the proposed scheme reduces the authentication cost. Moreover, if the movement of a mobile peer can be tracked, the number of candidate authenticators can be reduced, further cutting the cost of the pre-authentication initialization process. The authors will conduct further research in this area.

ACKNOWLEDGMENT

This work was supported by the National High-Tech Research and Development Plan of China (No. 2007AA01Z226).

REFERENCES

[1] B. Aboba, D. Simon and P. Eronen, "Extensible Authentication Protocol (EAP) Key Management Framework", IETF RFC 5247, August 2008.
[2] V. Narayanan and L. Dondeti, "EAP Extensions for EAP Re-authentication Protocol (ERP)", IETF RFC 5296, August 2008.
[3] Q. Wu and Y. Ohba, "EAP Pre-authentication Problem Statement", IETF draft-ietf-hokey-preauth-ps-06, March 8, 2009.
[4] J. Salowey, L. Dondeti, V. Narayanan and M. Nakhjiri, "Specification for the Derivation of Root Keys from an Extended Master Session Key (EMSK)", IETF RFC 5295, August 2008.
[5] B. Aboba, L. Blunk, J. Vollbrecht, et al., "Extensible Authentication Protocol (EAP)", IETF RFC 3748, June 2004.
[6] M. Parthasarathy, "Protocol for Carrying Authentication for Network Access (PANA) Threat Analysis and Security Requirements", IETF RFC 4016, March 2005.
[7] Elias Diem, "An Authentication Architecture for Network Access in Multi-Domain Mobile IPv6 Networks", Diploma Thesis, University of Zurich, Switzerland, March 2007.
[8] B. Aboba and D. Simon, "PPP EAP TLS Authentication Protocol", IETF RFC 2716, October 1999.


Section 3B
COMPUTER VISION


A New Algorithm of Edge Detection Based on Soft Morphology

Shang Junna, Jiang Feng
(Hangzhou Dianzi University, Hangzhou, Zhejiang 310037)

Abstract: An algorithm of edge detection based on mathematical morphology is discussed in this paper. Starting from the characteristics of the basic morphological operations used in image processing, the common edge detection algorithms and the traditional morphological edge detection algorithm are analyzed and compared, and the characteristics and shortcomings of each are given. Combining geometric considerations, applied to gray-scale image edge detection, and building on the classic edge detection algorithms and soft filtering properties, a soft morphological edge detection algorithm and an optimized version of it are put forward for edge detection and image de-noising.

Key words: mathematical morphology; gray-scale image; edge detection; soft morphology

1. Introduction

Mathematical morphology is a nonlinear methodology that uses structuring elements of a certain form to measure and extract the corresponding shapes in an image, in order to carry out image analysis and recognition. It is well suited to visual information processing and analysis, and its transformations are built from the basic operations of erosion and dilation as well as opening and closing.

Edge detection occupies a special position in image processing and computer vision: edges carry a large amount of image information and reflect the main features of objects. The extraction of edge information is therefore very important for image processing.

2. Morphological edge detection of gray images

Edges occur where the gray level or structure of an image exhibits abrupt changes, indicating the end of one region and the beginning of another. Classical edge detection methods are based on spatial operations; they are sensitive to noise, have poor noise robustness, and amplify noise at detected edges. For noise-polluted images, the usual approach is to filter first and then detect edges by differentiation. However, while the filter smooths the image, the edges inevitably become blurred and the detection results are affected, which is not conducive to subsequent processing such as image feature extraction; the added processing stages also lengthen the image processing time, making the approach unsuitable for situations with strict real-time requirements.

2.1 Commonly used edge detection algorithms

Edge extraction is usually based on various differential operators combined with templates, thresholds, smoothing and other means. Commonly used edge detection algorithms are the Roberts, Sobel, Prewitt, Laplacian, LoG and Canny algorithms.

Roberts is a first-order differential operator which uses the difference between adjacent pixels to detect edges. It has high positioning accuracy, but it has poor robustness to noise, which it cannot suppress.

Prewitt is a template operator for edge detection which uses the same principle as the Sobel operator: it detects edges by a gray-weighted neighbourhood difference over the pixels above, below, left and right, locating edges where this difference reaches an extremum. It can smooth noise, but the amount of calculation is large and its positioning accuracy is not good.

LoG is a second-order differential operator which first pre-smooths the image with a Gaussian low-pass filter and then identifies steep edges in the image with the Laplacian operator.
Finally, it generates closed, connected contours of zero gray value by eliminating all interior points. It can filter noise, but it also smooths the edges.

Canny is a filtering method that strikes a compromise between smoothing and filtering through a Gaussian smoothing filter. It has strong de-noising ability: real weak edges can be detected and interference from noise is less likely. However, some edge information is also smoothed away, and in practical applications it is complex to program and slower to compute.
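The paper's simulations of these operators are run in MATLAB (Figures 1-4 below). An equivalent quick comparison can be made with OpenCV in Python; the sketch that follows is a stand-in, with kernel sizes arbitrary and a synthetic test image used so the snippet is self-contained:

import cv2
import numpy as np

# Synthetic test image so the comparison runs without external files.
img = np.zeros((128, 128), np.uint8)
cv2.circle(img, (64, 64), 40, 255, -1)
img = cv2.GaussianBlur(img, (5, 5), 0)

# Classical detectors discussed above (Sobel, Prewitt, LoG, Canny).
sobel = cv2.magnitude(cv2.Sobel(img, cv2.CV_64F, 1, 0),
                      cv2.Sobel(img, cv2.CV_64F, 0, 1))
prewitt_x = cv2.filter2D(img.astype(np.float64), -1,
                         np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], float))
log = cv2.Laplacian(cv2.GaussianBlur(img, (5, 5), 1.0), cv2.CV_64F)
canny = cv2.Canny(img, 50, 150)

for name, e in [("sobel", sobel), ("prewitt_x", prewitt_x),
                ("log", log), ("canny", canny.astype(np.float64))]:
    out = cv2.normalize(e, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    cv2.imwrite(f"edges_{name}.png", out)

Running this on a noisy version of the image makes the trade-offs described above visible: the gradient operators respond strongly to noise, while LoG and Canny trade noise suppression for blurred or lost edge detail.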


The commonly used edge detection algorithms were simulated in MATLAB; the results are shown in Figures 1-4.

Figure 1 Prewitt algorithm
Figure 2 Roberts algorithm
Figure 3 LoG algorithm
Figure 4 Canny algorithm

2.2 Soft morphology

From the above summary of edge detection algorithms, the requirements on image edge detection are as follows:

1. First of all, it should detect true edges effectively;
2. Edge positioning must have high accuracy;
3. The ideal response to an edge is a single pixel;
4. It should respond well to edges of different scales and minimize missed edges;
5. It should be insensitive to noise;
6. The sensitivity of detection should depend only weakly on the direction of the edge.

These requirements are often contradictory, and it is difficult to unify them completely in one edge detector. Based on the classical algorithms, a new optimized algorithm is put forward: the soft morphological edge detection method, which builds a gray-image edge detection algorithm on rank-order weighted dilation and erosion operations from mathematical morphology.

In the soft morphological edge detection algorithm, using the properties of the soft transforms, the edge information is obtained from the difference of the transformed image. A soft morphological transform is determined by three parameters, namely B, A and k, so these three parameters become the basis of edge detection, owing to the monotonicity, extensivity and anti-extensivity of the different transforms. Taking f ⊕ [B, A, k] − f as an example, Figure 5 is used to explain the geometric significance of the edge detection. B and A are the structuring elements of the soft transform. In flat areas, the gray values of the points falling within the "window" of the structuring element are similar, so even after the transform the output and input differ little. In a gray-level transition area, the gray values differ greatly, so near the transition region the output of the transformed image is "higher" than the original image. By the extensivity of the soft dilation, f ⊕ [B, A, k] ≥ f. If the input image is subtracted from f ⊕ [B, A, k], then f ⊕ [B, A, k] − f reflects the edge information of the image: in flat areas, f ⊕ [B, A, k] and f are close, with a near-zero difference, while in gray-level transition areas there is a large difference between them.


The influence of the shape and size of the structuring elements on edge extraction has been studied by previous authors; here we only discuss the influence of the choice of the k value on the edge. As Figure 6 shows, for f ⊕ [B, A, k] the result decreases monotonically as k increases: for k1 < k2, f ⊕ [B, A, k2] ≤ f ⊕ [B, A, k1].

Figure 5 Geometric significance of edge detection (schematic)
Figure 6 Results for different k values

The soft algorithm is thus the following: select the structuring element -> define the hardcore element -> determine the value of k -> traverse the image with the structuring element, repeating the hardcore points k times at every image point and storing the collected values in an array -> sort the array in descending order and take the smallest of the first k values (the k-th largest) -> subtract the original image.

In the simulation, the structuring element (SE) is selected as

SE = [1 1 1 1 1; 1 1 1 1 1; 1 1 1 1 1] * 180

The shape of the SE is chosen as a convex block, so that more useful information can be obtained. The hardcore is selected as

Hardcore = [1 3; 2 1; 2 2; 2 3; 2 4; 2 5; 3 3];

saving all the points covered by the hardcore and looping k times. Following the basic soft morphology algorithm f ⊕ [B, A, k] − f, for any A, B, k, the simulation result is given in Figure 7.

Figure 7 Algorithm of gray soft morphology

In this way, in the presence of noise, the algorithm can be constructed as f ⊕ [B1, A, k] − f ⊕ [B, A, k], so that extracting the edge and erasing the noise can be done at the same time.

2.3 The modified algorithm of edge detection based on soft morphology

The original algorithm is

f ⊕ [B, A, k] − f :

the image is dilated first, and then the difference between the dilated image and the raw image is taken.

The modified algorithm is

f ⊕ [B, A, k] − f ⊕ [B, A, k] Θ [B1, A1, k1], for any A, B, k, A1, B1, k1 :

the image is again dilated first, but the difference is then taken with an opening of the image (soft dilation followed by soft erosion). The simulation result is given in Figure 8.

Figure 8 Modified algorithm of edge detection based on soft morphology

The figures indicate that the soft morphological transform can restrain image noise efficiently and detect the edges of the image under strong noise conditions. However, when the edge is complex, the edge detected by the simple soft morphology is coarser, while the new algorithm of this paper attains a single-pixel-wide edge.
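A direct, if slow, implementation of the basic soft dilation f ⊕ [B, A, k] and of the edge map f ⊕ [B, A, k] − f can be written as follows. This is our illustrative Python sketch of the rank-order definition used above, with the 3×5 flat structuring element and the plus-shaped hardcore from the simulation; the hardcore layout is our reading of the garbled listing:

import numpy as np

def soft_dilate(f, se_shape=(3, 5),
                hardcore=((0, 2), (1, 0), (1, 1), (1, 2),
                          (1, 3), (1, 4), (2, 2)), k=2):
    """Soft dilation f (+) [B, A, k]: hardcore pixels A are repeated k
    times, soft-boundary pixels B\\A once, and the k-th largest value
    of the multiset is taken (k = 1 gives the standard dilation)."""
    h, w = f.shape
    rh, rw = se_shape[0] // 2, se_shape[1] // 2
    pad = np.pad(f.astype(float), ((rh, rh), (rw, rw)), mode="edge")
    hard = set(hardcore)
    out = np.empty((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            win = pad[i:i + se_shape[0], j:j + se_shape[1]]
            vals = []
            for a in range(se_shape[0]):
                for b in range(se_shape[1]):
                    reps = k if (a, b) in hard else 1   # weight hardcore k times
                    vals.extend([win[a, b]] * reps)
            vals.sort(reverse=True)
            out[i, j] = vals[k - 1]                     # k-th largest value
    return out

f = (np.random.default_rng(0).random((64, 64)) * 255).astype(np.uint8)
edges = soft_dilate(f, k=2) - f                         # f (+) [B, A, k] - f
print(edges.min(), edges.max())

Because the window centre belongs to the hardcore, the k-th largest value is never below the centre pixel, which is exactly the extensivity property f ⊕ [B, A, k] ≥ f exploited by the edge map.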


edge is complex, the edge detected by simple soft morphology will be coarser, while the new algorithm of this paper can attain a single-pixel-wide edge.

3. Conclusion

Mathematical morphology, as an effective non-linear image processing method and theory, has very important applications in image processing, pattern recognition, computer vision and other fields. An edge detection algorithm based on mathematical morphology can overcome the shortcomings of traditional edge detection algorithms, but at the same time, as a result of the characteristics of morphological filters, the edges detected by morphological transformation are relatively wide. This paper focuses on image edge detection algorithms based on mathematical morphology.

In this paper, the commonly used edge detection algorithms as well as the traditional morphological edge detection algorithm are analyzed and compared, pointing out the characteristics and inadequacies of each. On this basis a soft morphological edge detection algorithm is put forward and then improved, yielding a modified soft morphological edge detection algorithm which can effectively detect image edges while restraining the impact of noise, and which has strong robustness. Additionally, based on actual requirements, different structural elements can be chosen flexibly to achieve detection for different purposes. It should be pointed out, however, that for edge detection of noisy images the performance enhancement comes at the cost of complexity; how to reduce the amount of computation to meet real-time requirements while effectively suppressing noise and detecting edges remains for further study.

Reference

[1] Ruan Qiu-qi. Digital Image Processing [M]. Beijing: Electronic Industry Press, 2001.
[2] FU Yong-qing, WANG Yong-sheng. A mathematical morphology-based edge detection algorithm for gray-scale images [J]. Journal of Harbin Engineering University, 2005, 26(5): 685-687.
[3] Chen Yang et al. Graphics Programming and Image Processing. Xidian University Press, 2003.
[4] Gong Wei. Digital Space Mathematical Morphology - Theory and Application [M]. Beijing: Science Press, 1997.
[5] Huang Feng-gang, Yang Guo, Song Ke-ou. Flexible Morphology in Image Edge Detection [J]. Chinese Journal of Image and Graphics, 2000, 5A(4): 284-287.
[6] Wei Gong, Shi Qing-yun, Cheng Ming-de. Digital Space Mathematical Morphology - Theory and Application [J]. Beijing: Science Press, 1997, 429-445.
[7] ZHAO Chun-hui, SUN Sheng-he, HUI Jun-ying. Optimization of soft morphological filters by simulated annealing [C]. Beijing: Proceedings of ICSP, 2000.
[8] T. Chen, Q. H. Wu, R. Rahmani-Torkaman, J. Hughes. A pseudo top-hat mathematical morphological approach to edge detection in dark regions [J]. Pattern Recognition, 2002, 35: 199-210.
[9] Cenzo A D. Strip mode processing of spotlight synthetic aperture radar data [J]. IEEE Trans on Aerospace and Electronic Systems, 1988, 24(3): 225-230.


3D TOOTH RECONSTRUCTION WITH MULTIPLE B-SPLINE SURFACES THROUGH LINEAR LEAST-SQUARES FITTING

Nailiang Zhao† and Weiyin Ma‡
† Institute of Graphics and Image, Hangzhou Dianzi University, Hangzhou, Zhejiang, P.R. China 310018
znl@hdu.edu.cn
‡ Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong, Kowloon, Hong Kong SAR, China
mewma@cityu.edu.hk

Abstract

Geometric modelling of teeth has great value in medical applications. Common 3D tooth models are usually discrete grid meshes based on image slice sequences, which are not convenient for further mathematical processing. In this paper, based on the unorganized points measured from a tooth, we obtain a 3D model of G1-connected B-spline surfaces through data segmentation and parameterization. The equations of multiple B-spline surface fitting are solved by the least-squares (LSQ) method. Finally, some examples of tooth surface reconstruction are provided.

Keywords: Tooth reconstruction; B-spline surface; G1 continuity; Linear least-squares fitting

1 Introduction

Advanced computer technology has been widely used in medical treatment. Various medical equipment, such as CT and MRI, is limited by the use of 2D planar images for disease diagnosis and treatment. 3D reconstruction of human organs has developed rapidly in recent years, with applications including education, simulation and treatment [1]. The morphological characteristics and physiological functions of teeth are very important in the dental sciences. In practice, doctors may suffer inconvenience in dental treatment, prosthetics and even communication with patients if only documents and pictures are available [2]. Therefore, there is an urgent requirement to reconstruct virtual 3D teeth from human teeth.

Buchaillard et al. presented a method to reconstruct the 3D shape of a tooth based on statistical models [3]. Liao et al. use the patient CT volume and a 3D geometric prior model to produce a "best-fit" patient-specific 3D model of the whole tooth using a thin-plate spline warp [4]. Wang et al. proposed a Special Marching Cubes (SMC) algorithm to construct 3D teeth based on isosurface construction [5]. Bors et al. proposed a binary morphological morphing algorithm for reconstructing the shape of various teeth from slices [6]. To our knowledge, little work focuses on multiple B-spline surfaces for reconstructing 3D teeth.

Among the publications on multiple B-spline surface modelling, Milroy et al. proposed a procedure for achieving approximate global G1 continuity [7]. Another approach was developed by Eck and Hoppe for the automatic construction of B-spline surfaces from unorganized points with exact G1 continuity [8]. However, the surface model therein is actually composed of Bézier patches rather than B-splines. Ma and Zhao present a method for fitting multiple B-spline surfaces with Catmull-Clark subdivision surfaces for extraordinary corner patches using pure linear least-squares fitting [9]. Based on the G1 continuity of two adjacent bicubic B-spline surfaces with single interior knots, Shi et al. discussed the reconstruction of convergent G1 smooth B-spline surfaces [10].

This paper develops an approach to reconstructing a 3D tooth from a set of measured points.
The principal contributions of the paper and the major differences between our approach and others are summarized as follows:
• B-spline surfaces: they are widely used by most commercial medical software systems and are the de-facto standard in the CAD/CAM industry.
• Unorganized points: it is easy to construct arbitrary topology from either 2D slice images or 3D laser-measured data.
• Efficiency: the final observation system of multiple surface fitting can be represented as a linear LSQ procedure.
• Continuity: perfect G1 continuity can be achieved by using linear connecting functions.


2 Theory of G1 continuity conditions for multiple connected B-spline surfaces

We first define two B-spline surfaces P(u, v) and Q(u, v):

P(u, v) = Σ_{i=1}^{n} Σ_{j=1}^{n} B_{i,k}(u) B_{j,k}(v) P_{ij},
Q(u, v) = Σ_{i=1}^{n} Σ_{j=1}^{n} B_{i,k}(u) B_{j,k}(v) Q_{ij}.    (1)

Without loss of generality, we assume that the two surfaces share the same order and the same set of knots in the u- and v-directions. For P(u, v), k is the order of the surface along the two directions, and P_{ij} ∈ R^3 are the n × n control points. {u_i} and {v_i}, i = 1, 2, ..., n + k, are the two sequences of knots, with u_k = v_k = 0 and u_{n+1} = v_{n+1} = 1. The parameters for Q(u, v) are defined in a similar way.

We further assume that the two surfaces are connected side by side along the v-direction, with P(1, v) and Q(0, v) being the two boundaries for G1 connection. It is easy to verify that the sufficient and necessary condition of G0 continuity between P(u, v) and Q(u, v) is

Σ_{i=n-k+2}^{n} B_{i,k}(1) P_{ij} = Σ_{i=1}^{k-1} B_{i,k}(0) Q_{ij},  j = 1, 2, ..., n.    (2)

Let P'_u(1, v), Q'_u(0, v) and Q'_v(0, v) denote the three tangent vector functions along the common boundary P(1, v) = Q(0, v). One of the sufficient conditions of G1 continuity between P(u, v) and Q(u, v) is

P'_u(1, v) = α Q'_u(0, v) + (βv + γ) Q'_v(0, v),    (3)

where α > 0 and β² + γ² ≠ 0. We call f(v) = α and g(v) = βv + γ of equation (3) the linear connecting functions.
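For reference, the tensor-product form of equation (1) can be evaluated directly with the Cox-de Boor recursion. The sketch below is a plain Python/NumPy illustration rather than the authors' code; it assumes a knot vector of length n + k and parameters strictly inside the knot range.

import numpy as np

def bspline_basis(i, k, t, knots):
    # Cox-de Boor recursion for the basis function B_{i,k} of order k
    # (degree k - 1) at parameter t; i is a 0-based index and knots
    # must hold at least n + k values. Half-open intervals mean the
    # right end of the knot range is excluded.
    if k == 1:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    value = 0.0
    d1 = knots[i + k - 1] - knots[i]
    if d1 > 0.0:
        value += (t - knots[i]) / d1 * bspline_basis(i, k - 1, t, knots)
    d2 = knots[i + k] - knots[i + 1]
    if d2 > 0.0:
        value += (knots[i + k] - t) / d2 * bspline_basis(i + 1, k - 1, t, knots)
    return value

def surface_point(P, k, knots, u, v):
    # Equation (1): P(u, v) = sum_i sum_j B_{i,k}(u) B_{j,k}(v) P_ij,
    # with P an (n, n, 3) array of control points.
    n = P.shape[0]
    Bu = np.array([bspline_basis(i, k, u, knots) for i in range(n)])
    Bv = np.array([bspline_basis(j, k, v, knots) for j in range(n)])
    return np.einsum('i,j,ijc->c', Bu, Bv, P)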
When constructing the topological structure for smooth multiple surface modelling, we adopt a boundary representation (B-rep) with quadrilateral domains. Each of the domains is filled with a B-spline surface defined by equation (1). Each surface boundary or domain boundary may be shared by at most two adjacent surfaces. Any number of boundaries may, however, meet at a common corner, as illustrated in Figure 1.

Figure 1: Configuration of multiple B-spline surfaces.

To determine the blending functions, we proceed corner by corner, similarly to the treatment of Bézier surfaces widely accepted in the literature, in a so-called umbrella-shaped configuration. In fact, the common boundary degenerates to a Bézier curve under the linear connecting functions [11]. Let {T_i}, i = 1, ..., N, be a set of outwards-oriented, anti-clockwise unit tangent vectors at a common corner P in Figure 1. The valence N is the number of boundaries (or surfaces) that meet at the corner P. Any control point with valence N ≠ 4, i.e., whose valence is three or larger than four, is called an extraordinary corner. Otherwise, it is called a regular corner.

A set of parameters {α_i > 0, β_i}, i = 1, ..., N, should be assigned to each of the common corners such that

T_{i+1} = -α_i T_{i-1} + β_i T_i,  i = 1, 2, ..., N (indices taken mod N).    (4)

For the i-th common boundary, the γ_i parameter at one end is actually the β_i parameter at the other end and, therefore, need not be addressed separately. Similarly to the construction of multiple Bézier patches, we should carefully define the parameters {α_i > 0, β_i} for all common corners without conflict and degeneracy at each corner and with twist compatibility over all corners. In this paper, we adopt a symmetrical solution at internal extraordinary corners and an unsymmetrical solution at internal regular corners [12]:

α_i = 1, β_i = λ = 2cos(2π/N), i = 1, 2, ..., N,  when N ≠ 4;
α_1 = α_2 = α_3 = α_4 = 1, β_1 = β_3 = 0, β_2 = -β_4 = λ, λ ≠ 0,  when N = 4.    (5)

Once the parameters {α_i > 0, β_i} of all corners have been defined, the parameters of the linear connecting functions in equation (3) of all common boundaries are also determined.
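A small helper makes the corner assignment of equations (4)-(5) concrete; it is a sketch under the paper's convention, with the nonzero λ of the regular-corner (N = 4) case left as a user-chosen value.

import math

def corner_parameters(N, lam=1.0):
    # Assign {alpha_i, beta_i}, i = 1..N, at a common corner of
    # valence N following equation (5): the symmetric solution for
    # N != 4, the unsymmetric solution (free nonzero lambda) for N == 4.
    if N != 4:
        lam = 2.0 * math.cos(2.0 * math.pi / N)
        return [(1.0, lam) for _ in range(N)]
    assert lam != 0.0, "lambda must be nonzero when N == 4"
    # alpha_1..4 = 1; beta_1 = beta_3 = 0, beta_2 = -beta_4 = lambda
    return [(1.0, 0.0), (1.0, lam), (1.0, 0.0), (1.0, -lam)]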


3 Strategies for tooth reconstruction with multiple B-spline surfaces

A set of unorganized points {P_i} ⊂ R^3, i = 1, ..., m, is measured from the continuous surface area of a human tooth, and {S_j(u, v)}, j = 1, ..., s, is the set of smooth connected B-spline surfaces with which we will approximate the tooth shape. Generally, three steps are needed to obtain the fitted B-spline surfaces.

3.1 Topological structure and base surface definition

A quadrilateral topological structure of the tooth is first interactively defined from the point cloud {P_i}, as illustrated in Figure 2. The four boundaries of each quadrilateral domain are represented as B-spline curves and are defined by fitting through a limited number of interactively selected sample points.

After topological structure modelling, a set of approximate B-spline surfaces {S_j(u, v)}, j = 1, ..., s, is created as base surfaces for the partition and parameterization of the measured points, as illustrated in Figure 3. The base surfaces are actually the initial approximation of the final fitted B-spline surfaces. They are easily created from the four boundary curves, with or without additional uni-directional interior curves for more accurate approximation.

3.2 Data partition and parameterization

Data partition and parameterization are realized through a projection process using the base surfaces, following the approach reported in [9]. After projecting all measured points {P_i} onto the multiple base surfaces {S_j(u, v)}, each point P_i in the point cloud is distributed to an individual topological domain and associated with the index j of a base surface, 1 ≤ j ≤ s. Two location parameters (u_i, v_i) of P_i relative to the corresponding base surface are also obtained simultaneously. After all the measured data are processed, one obtains a set of location parameters {(u_ij, v_ij)} corresponding to the partitioned data points {P_ij}, where m_j is the number of measured points belonging to base surface S_j, with Σ_{j=1}^{s} m_j = m.

3.3 Linear least-squares fitting

After introducing the measured data {P_ij} and the location parameters {(u_ij, v_ij)} into the multiple B-spline surfaces {S_j(u, v)}, defined similarly to equation (1), one obtains the following sets of observation equations:

B_j X_j = X̄_j,  B_j Y_j = Ȳ_j,  B_j Z_j = Z̄_j,    (6)

where X_j, Y_j and Z_j represent the collection of x-, y- and z-coordinates of the control points of the j-th surface, X̄_j, Ȳ_j and Z̄_j represent the coordinates of the measured points falling on the j-th surface, and B_j is the j-th observation matrix defined by

B_j = [ B_{1j}(·_{1j})   B_{2j}(·_{1j})   ...  B_{nj}(·_{1j})
        B_{1j}(·_{2j})   B_{2j}(·_{2j})   ...  B_{nj}(·_{2j})
        ...
        B_{1j}(·_{m_j j}) B_{2j}(·_{m_j j}) ... B_{nj}(·_{m_j j}) ],    (7)

where (·_{ij}) = (u_ij, v_ij) for i = 1, 2, ..., m_j and j = 1, 2, ..., s.

The s observation systems of equation (6) can be collectively represented by the following general observation equations:

B X = X̄,  B Y = Ȳ,  B Z = Z̄,    (8)

where B = diag{B_1, B_2, ..., B_s}, X, Y and Z represent the coordinates of the control points of all B-spline surfaces, and X̄, Ȳ and Z̄ represent the coordinates of all points measured from the tooth.

For the x-component, the integrated observation equations are

LSQ: B X = X̄,  s.t.: A X = 0,    (9)

where A X = 0 is the collection of all G0 and G1 continuity conditions of equations (2) and (3). After eliminating some variables of the control points X through variable substitution with A X = 0, one obtains the following simplified observation system for smooth multiple surface fitting:

B̃ X̃ = X̄,    (10)

and X̃ can be solved by the linear least-squares method without any constraints:

X̃ = (B̃^T B̃)^{-1} (B̃^T X̄).    (11)

Then all of X can be obtained through the reverse variable substitution with A X = 0.
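Once the continuity constraints have been eliminated, equation (11) is an ordinary linear least-squares problem. The following Python/NumPy sketch is illustrative only; the row-assembly helper and the use of lstsq in place of the explicit normal equations are our assumptions.

import numpy as np

def observation_row(Bu, Bv):
    # One row of the matrix in equation (7): products of the u- and
    # v-basis values at a measured parameter pair (u_ij, v_ij),
    # flattened in the same order as the control-point vector.
    return np.outer(Bu, Bv).ravel()

def fit_component(B_tilde, X_bar):
    # Equation (11) for one coordinate component: solve the reduced
    # system B~ X~ = X_bar in the least-squares sense. lstsq is
    # numerically safer than forming (B~^T B~)^{-1} B~^T X_bar directly.
    X_tilde, residuals, rank, _ = np.linalg.lstsq(B_tilde, X_bar, rcond=None)
    return X_tilde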
4 Results and discussions

The algorithms discussed in this paper are implemented in an in-house developed modelling environment built upon OpenGL on Windows XP (Pentium IV 1.7 GHz, 512 MB RAM). The final fitted 3D tooth is represented as a B-rep surface/solid model.

The original measured data of the tooth in this paper comprise 9337 unorganized points, shown in Figure 2. The topological model of the tooth contains 11 inter-connected quadrilateral surface domains, also shown in Figure 2. All initial base surfaces are shown in Figure 3 with G0 continuity. There are in total 11 common corners, with 8 extraordinary corners of valence N = 3 and 4 regular corners of valence N = 4.

The matrix B̃^T B̃ in equation (11) is invertible if a sufficient number of sample points is used. Precisely, it requires that there be at least one sample point located within the definition domain of each of the basis functions of the B-spline.


With too few sample points there is a danger of an unstable solution (the matrix is nearly singular), while with too many points the computing time becomes longer. For practical fitting we need to use "just enough" points to achieve stable and representative results. After data partition and parameterization, we adopt only 2816 points to fit the final surfaces for this example.

All B-spline surfaces shown in Figures 4 and 5 are defined as bi-quartic splines with uniform knots and k = 5, and the number of control points for each surface is n × n = 8 × 8 = 64 in Figure 4. Each surface has 4 × 4 = 16 patches. We resize all measured points into the interval [-1, 1] before processing; the maximum, average, minimum and standard deviations of the fitting errors of the reconstructed 3D tooth are then 0.0722, 0.0158, 0.0003 and 0.0002, respectively.

Figure 2: Measured points and quadrilateral topological structure.

Figure 3: Base surfaces for data partition and parameterization.

Figure 4: Fitted multiple B-spline surfaces with their control points.

Figure 5: Reconstructed 3D tooth with smooth connected B-spline surfaces.

5 Conclusions

This paper presents a method for reconstructing a 3D tooth with smoothly connected multiple B-spline surfaces through linear least-squares fitting. A quadrilateral topological structure is first defined. The control mesh of the multiple surfaces is then automatically constructed through linear least-squares fitting. The final fitted surfaces are smoothly connected with perfect G1 continuity and are easy to process further, for example for prototyping an artificial tooth in advance. Because the topological structures of one kind of tooth are similar, this method is quite effective and suitable for medical applications.

Acknowledgements

The work presented in this paper is sponsored by City University of Hong Kong through a Strategic Research Grant (#7001928) and by the Qianjiang Talent Project of Zhejiang Province, China (#2007R10011).

References

[1] Muraki Shigeru and Kita Yasuyo. "A survey of medical applications of 3D image analysis and computer graphics", Systems and Computers in Japan, vol. 37(1), pp. 13-46, January (2006).

[2] El-Bialy and Ahmed. "Towards a complete computer dental treatment system", Proceedings of the Cairo International Biomedical Engineering Conference, pp. 1-8, (2008).

[3] Stéphanie I. Buchaillard et al. "3D statistical models for tooth surface reconstruction", Computers in Biology and Medicine, vol. 37(10), pp. 1461-1471, October (2007).

[4] Sheng-hui Liao, Ruo-feng Tong and Jin-xiang Dong. "3D Whole Tooth Model from CT Volume using Thin-Plate Splines", Proceedings of the Ninth International Conference on Computer Supported Cooperative Work in Design, vol. 1, pp. 600-604, (2005).

[5] Hongjian Wang, Fen Luo and Jianshan Jiang. "3D Reconstruction of CT Images Based on Isosurface Construction", Proceedings of the International Conference on Intelligent Computation Technology and Automation, vol. 2, pp. 55-59, October (2008).

[6] A. G. Bors, L. Kechagias and I. Pitas. "Binary morphological shape-based interpolation applied to 3-D tooth reconstruction", IEEE Transactions on Medical Imaging, vol. 21(2), pp. 100-108, (2002).

[7] M. J. Milroy et al. "G1 continuity of B-spline surface patches in reverse engineering", Computer Aided Design, vol. 27, pp. 471-478, (1995).

[8] M. Eck and H. Hoppe. "Automatic reconstruction of B-spline surfaces of arbitrary topology type", Computer Graphics, vol. 30, pp. 325-334, (1996).
[9] Weiyin Ma and Nailiang Zhao. "Smooth multiple B-spline surface fitting with Catmull-Clark surfaces for extraordinary corner patches", The Visual Computer, vol. 18(7), pp. 415-436, (2002).

[10] Xiquan Shi, Tianjun Wang and Piqiang Yu. "A practical construction of G1 smooth biquintic B-spline surfaces over arbitrary topology", Computer Aided Design, vol. 36, pp. 413-424, (2004).

[11] Nailiang Zhao and Weiyin Ma. "Properties of G1 continuity conditions between two B-spline surfaces", Advances in Computer Graphics, Lecture Notes in Computer Science, vol. 4035, pp. 743-752, (2006).

[12] Weiyin Ma and Nailiang Zhao. "Exact G1 Continuity Conditions for B-Spline Surfaces with Applications for Multiple Surface Fitting", Proceedings of the International Conference on Manufacturing Automation, pp. 47-56, December (2002).


Automatic Recognition of Head Movement Gestures in Sign Language Sentences

Daniel Kelly, Jane Reilly Delannoy, John Mc Donald, Charles Markham
Computer Science Department, National University of Ireland, Maynooth, Ireland
dankelly@cs.nuim.ie

Abstract—A novel system for the recognition of head movement gestures used to convey non-manual information in sign language is presented. We propose a framework for recognizing a set of head movement gestures and for identifying head movements outside of this set. Experiments show that our proposed system is capable of classifying three different head movement gestures and of identifying 15 other head movements as movements outside of the training set. In this paper we perform experiments to investigate the best feature vectors for discriminating between positive and negative head movement gestures, and a ROC analysis of the system's classification performance showed an area under the curve measurement of 0.936 for the best performing feature vector.

Keywords—Sign Language, Non Manual Signals, HMM

I. INTRODUCTION

Sign language is a form of non-verbal communication where information is mainly conveyed through hand gestures. Since sign language communication is multimodal, it involves not only hand gestures (i.e., manual signing) but also non-manual signals. Non-manual signals are conveyed through facial expressions, head movements, body postures and torso movements. Recognizing sign language communication therefore requires simultaneous observation of manual and non-manual signals and their precise synchronization and signal integration. Thus understanding sign language involves research in the areas of face tracking, facial expression recognition, human motion analysis and gesture recognition.

Over the past number of years there has been a significant amount of research investigating each of these non-manual signals, attempting to quantify their individual importance. Works such as [1], [2], [3] focused on the role of head pose and body movement in sign language. These researchers found evidence strongly linking head tilts and forward movements to questions or affirmations. The analysis of facial expressions for the interpretation of sign language has also received a significant amount of interest [4], [5]. Computer-based approaches which model facial movement using Active Appearance Models (AAMs) have been proposed [6], [7], [8].

The development of a system combining manual and non-manual signals is a non-trivial task [9]. This is demonstrated by the limited amount of work dealing with the recognition of multimodal communication channels in sign language. Ma et al. [10] used Hidden Markov Models (HMMs) to model multimodal information in sign language, but lip motion was the only non-manual signal used. Their work was based on the assumption that the information portrayed by the lip movement directly coincided with that of the manual signs. While this is a valid assumption for mouthing, it cannot be generalised to other non-manual signals, as they often span multiple manual signs and thus should be treated independently.

In this paper we evaluate techniques for the automatic recognition of head movement gestures used to convey non-manual information in Irish Sign Language (ISL) sentences. We propose a framework for the automatic recognition of head movement gestures, building on the techniques proposed by Kelly et al. [11], who use a HMM threshold model system to recognize manual signals.
II. FEATURE EXTRACTION

The focus of this work is to evaluate the HMM threshold model framework as a system for recognizing head movement gestures. For completeness, we briefly describe the feature tracking techniques used, though we do not consider them to be the novel part of our work.

Figure 1. Extracted Features from Image

Face and eye positions are used as features for head movement recognition. Face and eye detection is carried out using a cascade of boosted classifiers working with haar-like features, as proposed by Viola and Jones [12]. A set of public domain classifiers [13], for the face, left eye and right eye, is used in conjunction with the OpenCV implementation of the haar cascade object detection algorithm. We define the raw features extracted from each image as follows: face position (FC_x, FC_y), left eye position (LE_x, LE_y) and right eye position (RE_x, RE_y).
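As an illustration of this stage, the raw features can be produced with OpenCV's haar cascade detector. The cascade file names and the single-face assumption below are illustrative, not the authors' exact configuration.

import cv2

# Paths to public-domain cascade XML files; these names are assumptions.
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier('haarcascade_eye.xml')

def extract_features(frame):
    # Return (FCx, FCy, LEx, LEy, REx, REy) for the first detected
    # face, or None if no face or too few eyes are found.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    eyes = eye_cascade.detectMultiScale(gray[y:y + h, x:x + w], 1.1, 4)
    if len(eyes) < 2:
        return None
    # Sort eye boxes left-to-right and take their centres in image coords.
    eyes = sorted(eyes, key=lambda e: e[0])[:2]
    (lx, ly, lw, lh), (rx, ry, rw, rh) = eyes
    return (x + w / 2.0, y + h / 2.0,
            x + lx + lw / 2.0, y + ly + lh / 2.0,
            x + rx + rw / 2.0, y + ry + rh / 2.0)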


III. HIDDEN MARKOV MODELS

Hidden Markov Models (HMMs) are a type of statistical model that can represent spatiotemporal information in a natural way. HMMs have efficient algorithms for learning and recognition, such as the Baum-Welch algorithm and the Viterbi search algorithm [14]. A HMM is a collection of states connected by transitions. Each transition (or time step) has a pair of probabilities: a transition probability (the probability of taking a particular transition to a particular state) and an output probability (the probability of emitting a particular output symbol from a given state). We use the compact notation λ = {A, B, π} to indicate the complete parameter set of the model, where A is a matrix storing transition probabilities, with a_ij denoting the probability of making a transition between states s_i and s_j; B is a matrix storing output probabilities for each state; and π is a vector storing initial state probabilities. HMMs can use either a set of discrete observation symbols or be extended to continuous observation signals. In this work we use continuous multidimensional observation probabilities calculated from a multivariate probability density function.

To represent a gesture sequence such that it can be modeled by a HMM, the gesture sequence must be defined as a set of observations. An observation O_t is defined as an observation vector made at time t, where O_t = {o_1, o_2, ..., o_M} and M is the dimension of the observation vector. A particular gesture sequence is then defined as Θ = {O_1, O_2, ..., O_T}. To calculate the probability of a specific observation O_t, we use the probability density function of an M-dimensional multivariate gaussian (see Equation 1), where µ is the mean vector and Σ is the covariance matrix:

N(O_t; µ, Σ) = (2π)^{-M/2} |Σ|^{-1/2} exp(-(1/2) (O_t - µ)^T Σ^{-1} (O_t - µ)).    (1)
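A direct Python/NumPy transcription of equation (1), shown as a sketch (a practical system would normally work with log-likelihoods for numerical stability):

import numpy as np

def gaussian_pdf(o, mu, cov):
    # Multivariate normal density N(o; mu, cov) of equation (1) for an
    # M-dimensional observation vector o.
    M = len(o)
    d = np.asarray(o, dtype=float) - np.asarray(mu, dtype=float)
    norm = (2.0 * np.pi) ** (-M / 2.0) * np.linalg.det(cov) ** -0.5
    return float(norm * np.exp(-0.5 * d @ np.linalg.solve(cov, d)))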
A. HMM Threshold Model

Lee and Kim [15] proposed a HMM threshold model to handle non-gesture patterns. The threshold model was implemented to calculate the likelihood threshold of an input pattern and provide a confirmation mechanism for provisionally matched gesture patterns. We build on this work by Lee and Kim to create a framework for calculating a probability distribution over head movement input gestures using continuous multidimensional observations. The computed probability distribution includes probability estimates for each pre-trained sign as well as a probability estimate that the input is a non head movement gesture.

In general, a HMM recognition system will choose the model with the best likelihood as the recognized gesture if that likelihood is higher than a predefined threshold. However, such a simple fixed likelihood threshold often does not work, so Lee and Kim proposed a dynamic threshold model to define the threshold for a given gesture sequence. A property of the left-right HMM model is that a self transition of a state represents a particular segment of a target gesture, while an outgoing state transition represents a sequential progression of the segments within a gesture sequence. With this property in mind, an ergodic model with the states copied from all gesture models in the system can be constructed, as shown in Figures 2 and 3, where the dotted lines in Figure 3 denote null transitions (i.e., no observations occur between transitions).

Figure 2. Dedicated Gesture Models

Figure 3. Threshold Model

States are copied such that output observation probabilities and self transition probabilities are kept the same, but all outgoing transition probabilities are equally assigned, as defined in Equation 2, where N is the number of states excluding the start and end states (the start and end states produce no observations):

ā_ij = (1 - a_ii) / (N - 1),  ∀j, i ≠ j.    (2)

As each state represents a subpattern of a pre-trained gesture, constructing the threshold model as an ergodic structure makes it match well with all patterns generated by combining any of the gesture sub-patterns in any order. The likelihood of the threshold model, given a valid gesture pattern, will be smaller than that of the dedicated gesture model because of the reduced outgoing transition probabilities. However, the likelihood of the threshold model, given an arbitrary combination of gesture sub-patterns, will be higher than that of any of the gesture models; thus the threshold model, denoted as λ̄, can be used as a non head movement gesture measure.

B. HMM Threshold Model & Gesture Recognition

Kelly et al. [11] expand on the work of Lee and Kim [15] to develop a HMM threshold model system which models continuous multidimensional sign language observations within a parallel HMM network to recognize two-hand signs and identify movement epenthesis. In this paper, we expand on the work of Kelly et al. to create a framework for recognizing head movement gestures.

For a network of HMMs Λ = {λ_1, λ_2, ..., λ_C}, where λ_c is a dedicated gesture HMM used to calculate the likelihood that the input gesture belongs to gesture class c, a single threshold model λ̄ is created to calculate the likelihood threshold for each of the dedicated gesture HMMs.
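The state-copying step of equation (2) can be sketched as follows; the start/end states and null transitions are omitted for brevity, and the block layout of the copied states is an assumption.

import numpy as np

def threshold_transitions(gesture_As):
    # Build the transition matrix of the ergodic threshold model from
    # the transition matrices of the dedicated gesture HMMs: keep each
    # copied state's self-transition a_ii, and assign (1 - a_ii)/(N - 1)
    # to every transition to another copied state, as in equation (2).
    self_probs = np.concatenate([np.diag(A) for A in gesture_As])
    N = len(self_probs)
    A_bar = np.empty((N, N))
    for i, a_ii in enumerate(self_probs):
        A_bar[i, :] = (1.0 - a_ii) / (N - 1)
        A_bar[i, i] = a_ii
    return A_bar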


IV. NON MANUAL SIGNAL RECOGNITION

While hand gestures do play the central grammatical roles, movements of the head, torso and face are used to express certain aspects of ISL. In this work we focus on a single non-manual signal, the head movement, to evaluate our techniques for recognizing non-manual features.

A. Model Training

Our system initializes and trains a dedicated HMM for each head movement gesture to be recognized. In this work we evaluate our techniques using three different head movement gestures: a left head movement, a right head movement and a left-forward movement. A visual example of a signer performing each of the three different head movement gestures is shown in Figure 4.

Figure 4. Example of the three different head movement gestures the system was tested on: (a) Right Movement (b) Left Movement (c) Left Forward Movement

To train the head movement HMMs, we recorded 18 different videos of a fluent ISL signer performing the head movements naturally within full sign language sentences. Six videos were recorded for each head movement gesture. Each head movement HMM λ^H_i (where 0 < i ≤ I and I is the total number of head gestures) was then trained on the observation sequences extracted from the corresponding videos.

The start and end point of each of the head movement gestures were labeled, the observation sequences Θ_i were extracted, and each HMM was then trained using the iterative HMM training model proposed by Kelly et al. [11]. A HMM threshold model λ̄^H is then created using the network of trained HMMs λ^H_i (where 0 < i ≤ I). The set of HMMs to recognize the I pre-trained head movement gestures is then denoted as Λ^H = {λ^H_1, λ^H_2, ..., λ^H_I, λ̄^H}.

B. Head Movement Recognition

Given an unknown sequence of head movement observations Θ^H, the goal is to accurately classify the head movement as one of the I trained gestures or as a movement which is not a trained gesture. To classify the observations, the Viterbi algorithm is run on each model given the unknown observation sequence Θ^H, calculating the most likely state path through each model i. The likelihood of each state path, which we denote as P(Θ^H | λ^H_i), is also calculated. The sequence of observations can then be classified as gesture i if Equation 3 evaluates to true:

P(Θ^H | λ^H_i) ≥ Ψ^H_i,    (3)

Ψ^H_i = P(Θ^H | λ̄^H) Γ^H_i,    (4)

where Γ^H_i is a constant scalar value used to tune the sensitivity of the system to head movements which the system was not trained on.

C. Experiments

An accurate head movement gesture recognition system must be able to discriminate between positive and negative head movement gesture samples; therefore, we perform a set of experiments to find the best feature set for discriminating between isolated positive and negative head gestures.

To test the discriminative performance of different feature vectors, we recorded an additional 7 videos for each head gesture (21 in total), in which a fluent ISL signer performed the head movement gestures within different sign language sentences. The start and end points of the head gestures were then labeled and isolated observation sequences Θ^τ_i were extracted. An additional set of 15 other head gesture sequences, outside of the training set, was also labeled in the video sequences to test the performance of the system when identifying negative gestures.

The classification of a gesture is based on a comparison with a weighted threshold model likelihood, with the weight denoted as Γ^H_i.
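A sketch of the decision rule of equations (3)-(4), written in the log domain; the log-domain formulation is our assumption, since Viterbi scores are normally computed as log-likelihoods.

import math

def classify(log_likes, log_threshold_like, gammas):
    # Accept gesture i when log P(Theta|lambda_i) >=
    # log P(Theta|lambda_bar) + log Gamma_i (equations (3)-(4) with
    # Gamma_i > 0), and among accepted gestures return the most likely
    # one; return None for a movement the system was not trained on.
    best_i, best_ll = None, -math.inf
    for i, ll in enumerate(log_likes):
        if ll >= log_threshold_like + math.log(gammas[i]) and ll > best_ll:
            best_i, best_ll = i, ll
    return best_i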
In our ROC analysis of the system, we vary the weight Γ^H_i over the range 0 ≤ Γ^H_i ≤ 1 and then create a confusion matrix for each of the weights.

To evaluate the performance of different features, we performed a ROC analysis on the models generated from the different feature combinations and calculated the area under the curve (AUC) for each feature vector model. Table I shows the AUC measurement of the four different feature vectors which were evaluated during our experiments. To calculate the directional vector of the head, (V^H_x, V^H_y), we used the midpoint between the eyes and calculated the direction the midpoint moved from frame to frame. We used a sliding window to average the directional vector, and in our experiments we evaluated the best performing window size for each feature vector.
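A sketch of the best-performing feature F2 follows; the exact windowing and alignment conventions are not specified in the paper, so those details are assumptions.

import numpy as np

def direction_features(left_eyes, right_eyes, window=12):
    # Feature F2: frame-to-frame motion of the eye midpoint, averaged
    # over a sliding window (12 frames performed best in Table I).
    # left_eyes, right_eyes: (T, 2) arrays of eye centres per frame.
    mid = (np.asarray(left_eyes, float) + np.asarray(right_eyes, float)) / 2.0
    v = np.diff(mid, axis=0)                 # per-frame direction vector
    kernel = np.ones(window) / window
    vx = np.convolve(v[:, 0], kernel, mode='valid')
    vy = np.convolve(v[:, 1], kernel, mode='valid')
    return np.stack([vx, vy], axis=1)        # smoothed (V_x^H, V_y^H)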


Although we evaluated each feature vector with a range of different window sizes, we report only the best performing window size for each feature vector in Table I.

TABLE I. AUC MEASUREMENTS FOR DIFFERENT FEATURE COMBINATIONS

Features                                                          Window Size   ROC AUC
F1 - Unit Direction Vector (V̂^H_x, V̂^H_y)                              6          0.821
F2 - Direction Vector (V^H_x, V^H_y)                                  12          0.936
F3 - Unit Direction Vector (V̂^H_x, V̂^H_y) + Angle Eyes (θ_eyes)        6          0.863
F4 - Direction Vector (V^H_x, V^H_y) + Angle Eyes (θ_eyes)              6          0.868

V. CONCLUSION

In this paper we have discussed current research in the area of automatic recognition of non-manual signals used in sign language. The development of a system to recognize non-manual signals is a non-trivial task, and this is demonstrated by the limited number of works dealing with non-manual signals in the context of sign language sentences.

We have presented a framework for recognizing head movement gestures used to convey non-manual information in sign language sentences. We expanded the HMM threshold model technique proposed by Lee and Kim [15] to develop a system which models continuous multidimensional head movement observations within a HMM network to recognize head movements and to identify head movement gestures which the system was not trained on. We performed experiments to investigate possible observation vectors which best discriminate between positive and negative head movement gesture samples. A ROC analysis of different observation vectors showed that the best performing vector, with an AUC measurement of 0.936, was a two dimensional vector describing the movement of the eye midpoint within a sliding window averaged over 12 frames.

The significance of the research presented in this paper is that we have developed a general technique for recognising head movement gestures. With a view to developing an automatic sign language recognition system, identifying non-manual signals such as head movement is an important task. In this paper we have demonstrated that our techniques are capable of identifying typical head movement gestures which occur in sign language sentences, therefore enabling us to determine whether or not a question was posed by the signer. Future work will involve incorporating these techniques into a wider framework for the automatic recognition of multi-modal continuous sign language.

ACKNOWLEDGMENT

The authors would like to acknowledge the financial support of the Irish Research Council for Science, Engineering and Technology (IRCSET).

REFERENCES

[1] B. Bahan, "Nonmanual realisation of agreement in american sign language," Ph.D. dissertation, University of California, Berkeley, 1996.

[2] E. van der Kooij, O. Crasborn, and W. Emmerik, "Explaining prosodic body leans in sign language of the netherlands: Pragmatics required," Journal of Pragmatics, vol. 38, 2006, prosody and pragmatics.

[3] C. Baker-Shenk, "Factors affecting the form of question signals in asl," Diversity and Diachrony, 1986.

[4] R. B. Grossman and J. Kegl, "To capture a face: A novel technique for the analysis and quantification of facial expressions in american sign language," pp. 273-305, 2006.

[5] R. Grossman and J. Kegl, "Moving faces: Categorization of dynamic facial expressions in american sign language by deaf and hearing participants," Journal of Nonverbal Behavior, vol. 31, no. 1, pp. 23-38, 2007.

[6] U. von Agris, M. Knorr, and K.-F. Kraiss, "The significance of facial features for automatic sign language recognition," pp. 1-6, 2008.
[7] U. von Agris, J. Zieren, U. Canzler, B. Bauer, and K.-F. Kraiss, "Recent developments in visual sign language recognition," Universal Access in the Information Society, vol. 6, no. 4, pp. 323-362, 2008.

[8] C. Vogler and S. Goldenstein, "Facial movement analysis in asl," Universal Access in the Information Society, vol. 6, no. 4, pp. 363-374, 2008.

[9] S. C. W. Ong and S. Ranganath, "Automatic sign language analysis: A survey and the future beyond lexical meaning," IEEE Trans. PAMI, vol. 27, no. 6, pp. 873-891, 2005.

[10] J. Ma, W. Gao, and R. Wang, "A parallel multistream model for integration of sign language recognition and lip motion," in ICMI '00: Proc. of the 3rd Intl Conf on Advances in Multimodal Interfaces, 2000, pp. 582-589.

[11] D. Kelly, J. McDonald, and C. Markham, "Recognizing spatiotemporal gestures and movement epenthesis in sign language," in IMVIP 2009, 2009.

[12] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," CVPR, IEEE, vol. 1, p. 511, 2001.

[13] M. Castrillon-Santana, O. Deniz-Suarez, L. Anton-Canalis and J. Lorenzo-Navarro, "Performance evaluation of public domain haar detectors for face and facial feature detection," VISAPP 2008, 2008.

[14] L. Rabiner, "A tutorial on hidden markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, Feb 1989.

[15] H. K. Lee and J. H. Kim, "An hmm-based threshold model approach for gesture recognition," IEEE PAMI, vol. 21, no. 10, pp. 961-973, 1999.


Dirt and Sparkle Detection for Film Sequences

Peter Gaughran, Susan Bergin, Ronan Reilly
Department of Computer Science, National University of Ireland, Maynooth
Maynooth, Co. Kildare, Ireland
peter.gaughran@nuim.ie
susan.bergin@nuim.ie
ronan.reilly@nuim.ie

Abstract— Until recently, filming has been an analogue process; it requires a mechanical process to record and view, and the source material itself is prone to decay & abrasion [1]. Film is expensive to store, and prohibitively expensive to restore. All footage - historical, documentary or entertainment - may completely degrade over time. While many archival film stocks are currently being scanned and further damage thus prevented, the digital copies are far from the quality of the original. The types of aberrations found are varied, from frame jitter and line scratches to dirt and sparkle. It is the detection of the latter two (which are frame-based abnormalities) that will be examined here.

Keywords— Dirt, sparkle, detection, machine vision, block matching

I. MOTIVATION

Traditionally, when restoring footage, each frame of a motion picture reel must be cleaned carefully by experts; for an average feature length of, for example, 2 hours, or 7,200 seconds at 24 frames a second, this results in approximately 172,800 frames that have to be cleaned by hand. Aside from the mechanical method of cleaning, particular areas must also be identified, cleaned if dirt is present, or 'filled in' if sparkle is found. Sparkle occurs when the film surface is scratched or scraped away, usually revealing a light surface (silver nitrate) underneath. It manifests as a small white or lightly coloured blotch in a frame of footage; see Figure 1. Sparkle can occur either chemically, over time, or mechanically, through the wear of repeated viewing. Dirt, however, is simply material that has stuck to the frame, as in Figure 2.

Fig. 1. Examples of sparkle encircled in the frame above. Note that in the preceding and following frames, sparkle will not be present in the same locations.

Fig. 2. Examples of dirt are circled above. Observe no sparkle is present at the points labeled in Fig. 1.

Both are often referred to simply as blotches. Given the time consuming nature of restoration, it is extremely expensive & labour intensive. In the digital era, although requiring less in the way of chemicals and physical storage, restoration is very similar to the traditional means. Once the source material has been scanned (usually using a 4K or 8K scanner), the frames are examined individually and dirt & sparkle identified, before being manually removed. The primary advantage may be said to be convenience. Digital automatic detection has been attempted, however.


II. PREVIOUS DIRT AND SPARKLE DETECTION

Industrial software exists (such as AlgoSoft, Amped and DIAMANT) - but the means of detection and success rates are unpublished; moreover, peer assessment & cinematic critique have not been favourable [2]. Previous academic research includes detection of dirt and sparkle by means of motion estimation and 3D autoregressive modelling - in particular, the JOMBADI (JOint Model BAsed Detection and Interpolation) algorithm [3]. The JOMBADI approach attempts to combine blotch detection and repair in a single step; a statistical model of the frame is created and motion vectors randomly adjusted until a predicted (reconstructed) frame is reached (based on either prediction error or a maximum number of iterations). This results in either very high computational loads and/or a lack of accuracy. Global Motion Segmentation for blotch detection has also been attempted - using this technique, blotches are detected as 'areas' of pixels that do not adhere to any parametric global interframe transformation model [4]. Being exhaustive, the result is also a computational load, and is subject to the accuracies, inaccuracies and possible contradictions of the various transformation models employed. Czúni et al. have implemented DIMORF - a neural network for semi-automatic detection coupled with an XML database to minimise false positives (by meta-tagging incorrect finds in a single frame, all other such instances can be ignored if found in subsequent frames) [5]. As such, DIMORF aspires more to be a semi-automatic detection and indexing software system. Regardless of the means, all approaches use pixel intensities as the input data, and most of the systems to date (JOMBADI included) use block matching techniques.

III. BLOCK MATCHING ALGORITHMS

Employed extensively in the domain of video encoding, block matching generally uses motion estimated from the current frame with respect to the previous frame. A motion compensated image is then created from blocks taken from the previous frame. Each frame is divided into 'macroblocks', which are then compared with the corresponding block and its adjacent neighbours in the previous frame. A vector is then created that stipulates the movement of a given macroblock from one location to another. The search area (of where the macroblock should be located) is constrained to up to p pixels of the previous frame; see Figure 3.

Fig. 3. A sample macroblock search space. The larger p becomes, the more computationally expensive the process is.

Usually the macroblock is taken as a square of side 16 pixels, and the search parameter p is 7 pixels. Compression is then achieved by means of JPEG encoded difference images - inherently smaller than the full, original frame [6].

A. Implementation

The work completed to date has consisted of implementing several block matching algorithms, in order to assess their suitability for potential use in dirt/sparkle detection - previously, only a modified version of exhaustive search block matching has been used for blotch detection [2]. These algorithms were fully implemented in Matlab, and include exhaustive search, three step search, simple and efficient three step search, new three step search, four step search, diamond search, and adaptive rood pattern search.

B. Results

As an initial means of comparison, each algorithm and its respective number of computations per frame were plotted; see Figure 4. In all cases, the macroblock size was set to 16, and the search parameter p was 7, as per the recommended values [7]. Another test was then completed with the presence of an artificial blotch at frame 15. Except for the adaptive rood pattern search, none of the other algorithms' output changed to reflect the presence of a break or discontinuity in motion estimation for a single frame, as can be seen in Figure 5. Adaptive rood pattern search assumes that general motion in a frame is usually coherent, i.e., it attempts to anticipate the direction of the motion vectors; as the others do not use this technique, their amount of computation is unaffected. The adaptive rood pattern search alone was then run on a sample 32 frame sequence, with genuine examples of dirt & sparkle digitally copied and placed at frames 5, 10, 15 and 20. However, the resultant graphs from both runs were identical, as in Figure 6. Only when the macroblock size was altered (to 8) and the search parameter p dropped to 4 were useful results obtained, thus indicating that the detection is size and therefore parameter dependent; see Figure 7. The encircled plateaus in Figure 7 that do not exist in Figure 6 represent the adaptive rood's attempt to find the closest match; finding such plateaus indicates the location of a potential blotch.
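For reference, the baseline exhaustive search used as the yardstick above can be sketched as follows (in Python/NumPy rather than the authors' Matlab). The per-block minimum MAD returned alongside the motion vectors is a residual that could, under our assumptions, also be inspected for the kind of single-frame discontinuity a blotch produces.

import numpy as np

def exhaustive_block_match(prev, curr, block=16, p=7):
    # Exhaustive-search block matching: for each macroblock of the
    # current frame, scan all candidate offsets within +/- p pixels in
    # the previous frame and keep the offset minimising the mean
    # absolute difference (MAD). block=16 and p=7 follow the
    # recommended values cited in the text.
    H, W = curr.shape
    by, bx = H // block, W // block
    vectors = np.zeros((by, bx, 2), dtype=int)
    costs = np.empty((by, bx))
    for r in range(by):
        for c in range(bx):
            y0, x0 = r * block, c * block
            tgt = curr[y0:y0 + block, x0:x0 + block].astype(float)
            best, best_dy, best_dx = np.inf, 0, 0
            for dy in range(-p, p + 1):
                for dx in range(-p, p + 1):
                    y, x = y0 + dy, x0 + dx
                    if y < 0 or x < 0 or y + block > H or x + block > W:
                        continue  # candidate block falls outside the frame
                    ref = prev[y:y + block, x:x + block].astype(float)
                    mad = np.mean(np.abs(tgt - ref))
                    if mad < best:
                        best, best_dy, best_dx = mad, dy, dx
            vectors[r, c] = (best_dy, best_dx)
            costs[r, c] = best
    return vectors, costs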
IV. FUTURE WORK

Further analysis and alteration of the adaptive rood pattern search is required - in particular of the macroblock & search parameter sizes - as well as the potential for implementing detection, and eventual reconstruction of the frames, via parallel means. Statistical or machine learning classifiers may be applied to suspected blotches to improve classification.

ACKNOWLEDGMENT

Thanks to all at the Department of Computer Science, and the Systems & Networks Department in the Computer Centre in the National University of Ireland, Maynooth.

REFERENCES

[1] R.A. Harris, "Preservation: Why Are Films and Videos Disappearing?", American Film Institute, Washington, D.C. Public Hearing, February 1993.

[2] J. Krebs, "Creating the Video Future", Sound & Vision Magazine, Nov. 2004.

[3] A. C. Kokaram, "Advances in the detection and reconstruction of blotches in archived film and video", Digital Restoration of Film and Video Archives (Ref. No. 2001/049), IEE, pp. 71-76.


[4] T. Komatsu, T. Saito, "Detection and Restoration of Film Blotches Using Global Motion Segmentation", ICIP99, Vol. III, pp. 479-483.

[5] L. Czúni, A. Hanis, L. Kovács, B. Kránicz, A. Licsár, T. Szirányi, I. Kas, Gy. Kovács, S. Manno, "Digital Motion Picture Restoration System for Film Archives (DIMORF)", SMPTE Motion Imaging Journal, Vol. 113, pp. 170-176, May-June 2004.

[6] I.E.G. Richardson, "Video Codec Design", Ch. 4-6, West Sussex: John Wiley & Sons, Ltd., 2002.

[7] Y. Tu, J. Yang, Y. Shen, M. Sun, "Fast variable-size block motion estimation using merging procedure with an adaptive threshold", ICME, Vol. 2, pp. 789-792, July 2003.

Fig. 4 - A measure of various block matching techniques, compared on the basis of the number of computations per frame. The sequence was 32 black and white frames long.

Fig. 5 - Note the change in adaptive rood at the presence of the blotch.

Fig. 6 - 32 frame sequence output, with macroblock and p size altered.

Fig. 7 - 32 frame sequence output, with blotches at the indicated frames.


Section 3C

GEOSENSORS


Lightweight Signal Processing Algorithms for Human Activity Monitoring using Dual PIR-sensor Nodes

Muhammad Tahir∗, Peter Hung∗, Ronan Farrell∗, Seán McLoone∗ and Tim McCarthy†
∗ Institute of Microelectronics and Wireless Systems
† National Centre for Geocomputation
National University of Ireland Maynooth, Maynooth, Co. Kildare, Ireland
Email: {mtahir, phung, rfarrell, sean.mcloone}@eeng.nuim.ie, tim.mccarthy@nuim.ie

Abstract—A dual Pyroelectric InfraRed (PIR) sensor node is used for human activity monitoring by means of simple data processing techniques. We first point out the limitations of existing approaches employing PIR sensors for activity monitoring. We study the spectral characteristics of the sensor data for the cases of varying distance between the sensor and the moving object as well as varying speed of the object under observation. The sampled data from the two PIR sensors is first processed individually to determine the activity window size, which is then fed to a simple algorithm to determine the direction of motion. We also show that a human count can be obtained for special scenarios. Preliminary results of our experimentation show the effectiveness of the simple algorithm proposed and open an avenue for estimating the more involved parameters used for speed estimation and localization.

Index Terms—Multi-sensor, activity monitoring, data fusion, pyroelectric IR.

I. INTRODUCTION

Human activity monitoring has always been of much importance because of a large class of applications, ranging from surveillance to tracking and from smart environments to navigation. Traditionally, human activity monitoring is performed using image sensors producing large data volumes, resulting in huge data processing overheads. This may be required to extract certain features of interest, for instance the number of people, position, direction and speed of motion [1], to name a few. Although activity monitoring approaches based on visual sensor solutions provide accurate results, they require large investment and significant infrastructure deployment. Contrary to that, a system based on pyroelectric infrared (PIR) sensors exploits pyroelectricity to detect an object which is not at thermal equilibrium with its environment [2]. PIR sensors have seen wide deployment in commercial applications: to detect human presence, to trigger security alarms, and to control lighting. In addition, these sensors have also found applications in thermal imaging, radiometry, thermometry as well as biometry [3], [4].

While a single PIR sensor is widely used for each surveillance region in security related applications to detect an intruder [5], multiple PIR sensors are needed for more advanced applications, such as achieving coverage [6], assisting video surveillance [7] and performing tracking [8]. PIR sensors have been used to differentiate a still person from the background [5]. The authors in [6] employed four PIR sensors to achieve 360° coverage while performing human detection. Since the outputs from all four PIR sensors are fed to a summing amplifier before the analog-to-digital converter (ADC), the individual sensor outputs are inaccessible to the algorithm. Doing so limits the performance of the sensor node to human detection only. A video surveillance system using multi-modal sensor integration is proposed in [7], where a camera-based tracking system is integrated with a wireless PIR sensor network.

PIR sensors have also been integrated with other sensing modalities to achieve lightweight processing.
The problem of localization in a dynamic environment is considered in [9] by using PIR and ultrasonic sensors simultaneously. Linear regression along with smoothing is used for distance correction, leading to accurate localization. The multi-modal sensor node design in [6] integrates PIR sensors with acoustic and magnetic sensors to differentiate among humans, soldiers and vehicles. The idea is based on exploiting multiple sensor modalities to achieve the objective.

The task of human monitoring and tracking using PIR sensors can also be implemented in a hierarchical network. This involves the collective actions of sensing modules acting as slaves, a synchronization and error rejection module as a master, and a data fusion module termed the host, as discussed in [8]. In this particular implementation, the geometric sensor module is designed with multiple PIR sensors, each equipped with a Fresnel lens array to obtain a spatially modulated field of view. In addition to tracking, PIR sensors can also be used to detect, differentiate and describe human activity. A multimodal system using a dual PIR sensor node for direction of motion detection using a sensor activation sequence is presented in [7] [10]. The usage of the polarity of the first pulse produced by the sensor for determining the direction of motion limits the applicability of this approach. In [11] the authors have used PIR enabled sensor nodes exchanging information with a base station to determine the direction and number of people. However, this approach is limited due to the requirement for accurate time synchronization across the sensor nodes and the communication overhead involved.


Our proposed approach partially addresses these issues by integrating two PIR sensors at each sensor node, providing accurate timing for the sampled data from the two PIR sensors and eliminating the associated communication overhead.

The rest of the paper is organized as follows. In Section II we discuss the approaches taken in the literature to obtain the basic set of parameters leading to an effective human activity monitoring system. Section III outlines the procedure for data acquisition and the simple algorithms used for processing the data. In Section IV we provide the results for the different parameters of interest obtained using the simple processing techniques discussed in Section III. Finally, we conclude in Section V with some future directions.

II. HUMAN ACTIVITY MONITORING

Usually PIR sensors are designed as part of an overall intrusion detection system, where alarms are activated whenever a PIR output exceeds a predefined threshold. Multiple PIR sensors, along with simple signal processing algorithms, can be used to obtain parameters of interest for human activity monitoring (e.g. direction of motion, speed and distance of the object, and counting of objects, to name a few). The first step towards this objective involves distinguishing each individual object and determining its direction of motion as it enters the field-of-view (FOV) of the sensor. The next step involves counting the number of human beings passing through the sensor FOV and estimating the speed of motion. However, there are two key issues in counting the objects passing by and measuring their speed of motion.

The first issue is related to counting the number of people passing through the area under observation. There are situations where more than one human being, for instance multiple persons having a conversation and walking parallel to each other, pass through the FOV of the sensor close enough to one another that their collective PIR sensor output is almost the same as for one person passing. This is because the excitation duration, and as a result the size of the event window, is proportional to the human body 'thickness', which appears to be the same for the two scenarios. The second issue is related to speed measurement. Different approaches from the literature, discussed below, are limited in their applicability due to the following key features of the sensor response:

• Signal strength at the output of the sensor is a function not only of distance but also of the speed of the moving object. For instance, a relatively slow moving object at the same distance will produce a weaker signal compared to an object moving at a higher speed. This can be seen from our experimental results shown in Fig. 1. The results in Fig. 1 also show the effect of speed on the spectral characteristics of the output signal.

• The other key aspect of the sensor signal response is the effect of the distance between the sensor and the moving object. A change in distance not only affects the signal strength, but also the spectral characteristics of the sensor response. This can be seen from the experimental results in Fig. 2, where a change in distance from 1 m to 2 m results in a frequency change from 1.2 Hz to 0.55 Hz for the strongest frequency component.
Fig. 1. Experimental results for three different speeds ("slow", "medium", "fast") and the corresponding spectra at a fixed distance from the PIR sensor; the respective peak frequencies are 0.65 Hz, 1.2 Hz and 2.1 Hz.

Fig. 2. Spectral characteristics as a function of varying distance (1 m, 1.5 m and 2 m) between the moving human object and the PIR sensor at a fixed speed; the respective peak frequencies are 1.2 Hz, 0.75 Hz and 0.55 Hz.

A. Direction of Motion

A specialized lens arrangement is used in [11] for determining the direction of motion. Specifically, the authors reduced the Fresnel lens horizontal span to a minimum and chose a two element PIR sensor, to obtain a phase shift of 180° in the sensor response for the opposite direction.


This approach is limited, since using a different lens arrangement or a PIR sensor with an arbitrary number of elements may not give the same response. A multimodal system using a dual PIR sensor node for direction of motion detection using the sensor activation sequence is presented in [7] [10]. Our approach to the problem of direction of motion is somewhat similar to the one in [7], but we measure the phase delay between the responses of the two PIR sensors. The phase delay not only provides accurate direction detection but also helps in estimating the speed of the moving object.

B. Human Counting

An automated people counting system using low resolution cameras along with a thermal imagery sensor is discussed in [12]. The two imaging systems complement each other in counting people for the low and high density cases. PIR based direction of motion detection, as well as counting of humans using a specialized Fresnel lens, is proposed in [13]. Three physically distributed sensor nodes along a hallway are used for counting people. Two different cases, of people walking in line and walking side by side, are considered, and the same direction of motion is assumed for all the objects in the group. An accuracy of 75% is claimed for the case when multiple persons are walking side by side.

C. Speed Measurement

The authors in [14] have used frequency variations as a raw indicator of speed. Twenty repetitive independent back and forth walks are performed for three different speeds, namely fast, moderate and slow, along a fixed path (hence at the same distance from the sensor). The authors do not consider the variation in the spectral characteristics resulting from varying the gap between the sensor and the walking person. As we will observe from the empirical results, there is a considerable difference in spectral characteristics due to varying distance. Hence spectral variations alone cannot be used as a measure of speed, and it is necessary to take into account the effect of distance.

Another approach, used for vehicular traffic speed measurement employing PIR sensors, is discussed in [15]. The proposed method is based on measuring the time the vehicle takes to traverse a fixed distance between the footprints of the FOVs of the two sensors on the roadway. Consider an object moving at constant speed v and being detected by a PIR sensor for the time interval t. If there are two sensors placed close to each other, such that the midpoints of their FOVs are separated by a distance d, as shown in Fig. 3, then

d = ∫_{t1}^{t2} v dt,    (1)

where t1 and t2 correspond to the time instances when the moving object reaches the sensor FOV midpoints, corresponding to the centers of the event windows. The assumption here is that the object is moving in a narrow pathway (of width c as depicted in Fig. 3), so that the distance d can be approximated as a constant. This results in the two sensors producing approximately similar output regardless of how the moving object approaches the detector. For this fixed value of d, as depicted in Fig. 3, the expression in (1) can be rewritten as

v = d / (t2 - t1),    (2)

which is used in [14] for speed measurement. The result in (2) can be used to estimate the speed of moving objects only for constant d. This result will not be valid for human activity monitoring, where the distance between the sensor and the moving object changes considerably.
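A sketch of the timing-based estimate: the delay between the two sensor responses can be taken from the peak of their cross-correlation (its sign also indicates the direction of motion), after which equation (2) applies. The correlation-peak method below is an assumption consistent with the correlation analysis described in Section III; the 0.1 kHz sampling rate is the one used there.

import numpy as np

def phase_delay(s1, s2, fs=100.0):
    # Delay (in seconds) of sensor 2 relative to sensor 1, taken from
    # the peak of the full cross-correlation of the mean-removed
    # signals; a positive value means sensor 2 fired later, and the
    # sign therefore gives the direction of motion.
    s1 = np.asarray(s1, float) - np.mean(s1)
    s2 = np.asarray(s2, float) - np.mean(s2)
    xc = np.correlate(s2, s1, mode='full')
    lag = int(np.argmax(xc)) - (len(s1) - 1)
    return lag / fs

def speed(d, t1, t2):
    # Equation (2): v = d / (t2 - t1) for FOV midpoints separated by d.
    return d / (t2 - t1)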
D. Distance Measurement

Distance estimation using two sensor nodes is discussed in [13], where the wireless sensor nodes are installed on opposite sides of the hallway. They use two different features, the relative amplitude and the signal duration from the two different sensors, for distance estimation. The results in [13] show that only region-based approximate distance classification is possible using this arrangement.

Fig. 3. Physical arrangement of two PIR sensors and their FOVs. To limit the error due to the relative proximity of the human object to the sensor, we assume that c/l ≪ 1, leading to d ≈ d_1 and d ≈ d_2.

III. DATA ACQUISITION AND PROCESSING

The data is either sampled directly or amplified before sampling, depending on the signal magnitude at the sensor output. Digital potentiometers are used for dynamic amplifier gain control to improve the performance range. We have used periodic sampling at a rate of 0.1 kHz for data sampling from the two PIR sensors simultaneously. The sampling rate is chosen to cover a wide range of pedestrian walking speeds. The experiments are performed indoors under bright light conditions.

Data from two PIR sensors mounted on a single node, for distance and speed variation of a single moving object, is analyzed for its spectral characteristics. Zero padding was used to improve the resolution of our small-sized data set. The frequency components corresponding to peak amplitude at different distances and moving speeds are shown in Fig. 4 and Fig. 5 respectively. As can be seen from the results, an increase in the distance results in a decrease in the frequency of the strongest spectral component.


On the other hand, increasing speed leads to an increase in the frequency, as expected.

A. Event Window Calculation

To facilitate human detection and motion tracking, the following data processing is proposed. The duration of each sensor excitation, including the start and end times, should first be found. Then, the number of people as well as their direction of passage through the sensor node viewpoint in a given time interval can be deduced. The RMS values of the sensor outputs over the event windows are recorded in an attempt to observe their relationship with the distance l from the sensor as well as the speed of the moving object. Correlation analysis of delayed sensor outputs is employed to calculate the relative phase delay of the two signals, one of the parameters that relates to the speed of object passage.

Fig. 4. Spectral characteristics as a function of distance variation between the moving human object and the PIR sensor for an approximate fixed speed of 5 km/h.

Fig. 5. Variation of spectral characteristics as a function of human speed at a fixed distance of 2.8 m. Experiments for speeds ranging from "slow walking" to "running" are performed. [Plot: peak frequency versus approximate human speed v (km/h) for 1024-, 2048- and 4096-point FFTs.]

The general steps used to find the duration of sensor excitation, in the form of an event window w, are illustrated in Fig. 6. As can be seen from Fig. 6, the first low-pass filter is responsible for removing the background noise inherent in the sensor signals and can be different for indoor and outdoor situations. Currently, a third-order Butterworth filter with a cut-off frequency of 5 Hz is employed. The filtered and full-wave rectified signal is then quantized prior to the second low-pass filtering stage. Each individual temporal sensor excitation is segmented by the second, first-order Butterworth low-pass filter with a 0.5 Hz cut-off, which creates an 'enclosure' envelope for each excitation. Finally, a gradient search on the binary signal is performed in each enclosure to detect the event window start and end times, and hence the duration of the sensor excitation.

Fig. 6. Block diagram of data processing from a single PIR sensor sampled at 10 msec. [Stages: sampled PIR data minus data mean, low-pass filter 1, signal rectification, signal quantization, low-pass filter 2, event window calculation, cross-correlation calculation, giving phase delay and correlation coefficient.]

During testing, a minimum distance l (Fig. 3), currently set at 2 m, is used to prevent saturated sensor excitation. Also, it is found that the absolute-mean sensor output provides more accurate timing information of node excitation compared to the zero-mean output. This is because a more effective low-pass filtering is possible for non-negative signals compared to ones with fluctuations above and below the mean value.
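The event-window pipeline of Fig. 6 maps directly onto a few lines of filtering code. The sketch below is a minimal interpretation in Python, assuming the paper's 0.1 kHz sampling rate; the quantization threshold and the 0.5 envelope cut point are illustrative choices, not values from the paper.

```python
# Sketch of the event-window pipeline of Fig. 6: mean removal, 3rd-order
# Butterworth low-pass at 5 Hz, full-wave rectification, binary quantization,
# 1st-order Butterworth low-pass at 0.5 Hz, then edge ("gradient") search.
import numpy as np
from scipy.signal import butter, lfilter

FS = 100.0  # sampling rate (Hz); the paper samples at 0.1 kHz

def event_windows(x, thresh=0.1):
    x = x - np.mean(x)                      # remove the signal mean
    b1, a1 = butter(3, 5.0 / (FS / 2))      # LPF 1: background-noise removal
    x = lfilter(b1, a1, x)
    x = np.abs(x)                           # full-wave rectification
    q = (x > thresh).astype(float)          # quantization (assumed threshold)
    b2, a2 = butter(1, 0.5 / (FS / 2))      # LPF 2: 'enclosure' envelope
    env = lfilter(b2, a2, q)
    binary = env > 0.5                      # inside/outside an excitation
    edges = np.diff(binary.astype(int))     # gradient search for edges
    starts = np.where(edges == 1)[0] / FS
    ends = np.where(edges == -1)[0] / FS
    return list(zip(starts, ends))          # (start, end) times in seconds
```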


B. Human Counting

Due to the incorporation of two PIR sensors in a sensor node, the approaching direction of a human with respect to the sensor node can readily be checked by comparing the sign of the phase delay between the two sensor outputs. The phase delay can be computed as

\phi_{\text{delay}} = \arg\max_{\phi} \left[ C\big( y_1(t_{s1}, t_{e1}),\; -y_2(t_{s2} + \phi,\, t_{e2} + \phi) \big) \right]    (3)

where φ is the relative phase delay in the output from sensor 2 with respect to the sensor 1 output, C denotes the cross-correlation between two signals, y(a, b) represents the sensor output within the interval a to b, and t_s and t_e are the start and end times of an event window, respectively. Note that the negative polarity on y_2 is to account for the physical arrangement of the two PIR sensors, which are mounted 180° phase-shifted on the sensor board. Taking this into account produces a higher average C_max than performing cross-correlation analysis on the two sensor outputs with the same polarity. The magnitude of the phase delay is also related to the speed v of the human motion, while the magnitude of the maximum correlation,

C_{\max} = C\big( y_1(t_{s1}, t_{e1}),\; -y_2(t_{s2} + \phi_{\text{delay}},\, t_{e2} + \phi_{\text{delay}}) \big)    (4)

indicates the accuracy of the phase delay matching. A value of C_max approaching unity suggests near-perfect matching. In practice, a maximum absolute phase delay threshold |φ_max| should be included in the data processing to prevent matching with sensor excitations from the previous or the following event windows.

C. Speed and Distance Measurement

Since the speed and distance affect the signal amplitude as well as the frequency, it becomes a non-trivial task to measure both parameters simultaneously. One possible approach is to employ multiple sensor nodes and combine their relative position information to estimate these parameters. Alternatively, we can fix one of the parameters to estimate the other, although this leads to a solution with limited practicality. Another possible solution is to use multiple sensor modalities to resolve one parameter. For instance, using an ultrasonic sensor we can estimate the distance fairly accurately [9], which can then be used to extract the speed from the frequency.

IV. EXPERIMENTAL RESULTS

This section presents some preliminary results on human activity monitoring. Fig. 7(a) shows the raw output of one of the sensors and the event window durations (marked as vertical solid lines, with their equivalent numerical values shown in black above the event window) corresponding to the variation in distance. The numerical values in grey represent the envelope of the filtered signal. Fig. 7(b) shows the results for the variation in speed. As can be observed from Fig. 7(a), the event window duration varies between 3.17 s and 1.73 s for different distances, while for the case of speed variation the event window varies between 4.68 s and 1 s. By minimizing the event window variation for the case of different distances, the event window can be used for differentiating the signal spectral changes due to varying speed from those due to varying distance.

Fig. 7. The raw sensor signal for a moving object and the resulting windowed output for a given threshold level for (a) increasing distance at an approximate speed of 5 km/h, (b) increasing speed at a fixed distance of 2.8 m. The raw signal is obtained from a 10-bit ADC with thresholds at 10 and 1000 to avoid saturation.

The data from the two sensors is processed and the maximum cross-correlation is computed in order to obtain the phase delay. Table I gives the phase delay corresponding to different distances. Since the person was walking back and forth in front of the two PIR sensors, the resulting phase delay has corresponding sign reversals showing the direction of motion. Table I also gives the cross-correlation coefficients at different distances. This approach for determining the direction of motion is more robust and generalized compared to the polarity-based special case. The increase in the phase delay with increasing distance is due to the larger separation between the midpoints of the FOVs of the two PIR sensors.
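A discrete version of (3) and (4) can be sketched with a standard cross-correlation. The sample rate below matches the paper's 0.1 kHz; the |φ_max| bound and the normalization are assumptions for illustration.

```python
# Sketch of the phase-delay estimate in (3)-(4); sensor 2 is negated to
# account for the 180-degree mounting of the two elements.
import numpy as np

FS = 100.0       # samples per second
PHI_MAX = 3.0    # assumed bound on |phase delay| in seconds

def phase_delay(y1, y2):
    """Return (phi_delay in seconds, normalized C_max) for two
    event-windowed sensor segments y1, y2 of equal length."""
    y1 = y1 - y1.mean()
    y2 = -(y2 - y2.mean())                    # negate sensor 2 output
    c = np.correlate(y1, y2, mode="full")     # lags -(n-1) .. (n-1)
    lags = np.arange(-len(y1) + 1, len(y1)) / FS
    c[np.abs(lags) > PHI_MAX] = -np.inf       # enforce |phi| <= phi_max
    k = int(np.argmax(c))
    c_max = c[k] / (np.linalg.norm(y1) * np.linalg.norm(y2))
    return lags[k], c_max                     # sign of lag gives direction
```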


TABLE I
CROSS-CORRELATION BASED PHASE DELAY VARIATION AS A FUNCTION OF DISTANCE FOR v = 5 km/h.

Distance (m)   Phase Delay   Cross-correlation Coefficient
1.3             0.5100        0.7012
1.3            -0.4300        0.8341
1.8             0.7700        0.8980
1.8            -0.6300        0.8694
2.3             0.8300        0.9224
2.3            -0.8300        0.8608
2.8             1.0600        0.8730
2.8            -1.0000        0.6667
3.3             1.1200        0.7927
3.3            -1.0100        0.6537
3.8             1.3600        0.6677
3.8            -1.0900        0.5324

TABLE II
CROSS-CORRELATION BASED PHASE DELAY VARIATION AS A FUNCTION OF SPEED FOR l = 2.8 m.

Speed (km/h)   Phase Delay   Cross-correlation Coefficient
1               2.6900        0.7116
1              -2.0900        0.7878
2               2.0500        0.8217
2              -1.6200        0.7995
3               1.1400        0.8341
3              -1.0800        0.8067
5               1.1000        0.8635
5              -0.9800        0.8171
10              0.6400        0.9130
10             -0.6200        0.8372
15              0.3700        0.8399
15             -0.4200        0.7909

The phase delay variation for different speeds, for a single object moving back and forth, is provided in Table II. As expected, the phase delay corresponding to a higher speed of motion is small, i.e. 0.37 and -0.42, in contrast to the phase delays of 2.69 and -2.09 corresponding to a very slow speed.

Using the event window duration and the direction of motion, the objective of counting people can be achieved for the following cases: 1) a single person enters or exits at an entrance; 2) multiple people enter or exit in a queue. In the case where multiple people walking in a queue are close to each other, the sensor excitations do not have any region of inactivity separating them. However, the knowledge of the phase delay can be used to get a rough estimate of the speed, which, along with the event window duration, provides the count of the people present in the queue.

We have also studied the effect of distance and speed variations on the received signal strength. The root mean square (RMS) is used as a measure of received signal strength variation. Fig. 8 shows the normalized RMS, based on its event window duration, as a function of distance. The results show the closeness of the RMS values for the two opposite directions of motion. The RMS variation as a function of the speed of the moving object is shown in Fig. 9, where the responses are different for the opposite directions of motion. This is due to the fact that the two sensors had different background views.

Fig. 8. The normalized RMS, based on event window duration, as a function of distance variation for v = 5 km/h.

Fig. 9. The normalized RMS, based on event window duration, as a function of speed variation for l = 2.8 m.

A simple algorithm to obtain an estimate of the speed is not possible, because the different parameters varying with speed are also affected by distance. For instance, as we observed earlier, the duration of the event window changes from 3.17 s to 1.73 s at different distances for a fixed speed v, compared to a variation from 4.68 s to 1 s for the case when the speed is changed at a fixed distance l (Fig. 7). As a result, we may not use the event window duration for a reasonably accurate estimate of distance. This implies that the use of the event window duration along with the frequency and amplitude parameters may not lead to simple algorithms to determine the distance parameter accurately. On the other hand, from Fig. 5, the signal frequency changes approximately linearly and is a more reliable parameter for speed estimation when the distance parameter l is fixed. Using a curve fit to the speed data in Fig. 5 we obtain the linear relationship f = 0.0867v + 0.3, which can be used to estimate v provided l is fixed, i.e. v ≈ (f - 0.3)/0.0867.
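As a worked illustration of this speed estimate, the sketch below extracts the dominant spectral peak from a zero-padded FFT (as described in Section III) and inverts the reported linear fit. The FFT length is an assumption, and the fit is only valid at the fixed distance of 2.8 m.

```python
# Sketch: dominant spectral frequency -> speed via the reported linear fit.
import numpy as np

FS = 100.0  # sampling rate (Hz)

def dominant_frequency(x, nfft=4096):
    """Peak frequency of a zero-padded FFT, ignoring the DC bin."""
    spec = np.abs(np.fft.rfft(x - x.mean(), n=nfft))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / FS)
    return freqs[1:][np.argmax(spec[1:])]

def speed_from_frequency(f_hz):
    """Invert f = 0.0867 v + 0.3 (valid only at the fixed distance)."""
    return (f_hz - 0.3) / 0.0867   # speed in km/h
```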


In future we plan to use an ultrasonic sensor for estimating the distance along with the PIR sensors, to develop an improved human activity monitoring system.

V. DISCUSSION AND CONCLUSIONS

We have developed lightweight signal processing algorithms for sensor nodes equipped with dual Pyroelectric InfraRed (PIR) sensors to achieve the objective of human activity monitoring. First, the limitations of the existing approaches for activity monitoring are discussed. Next, the spectral characteristics of the sensor data for varying distance and speed of the moving objects are analyzed. Data from dual PIR sensor nodes is first processed individually to determine the activity window size, which is then used to determine the direction of motion. A human count for special scenarios can be obtained using the direction of motion and the event window durations. Preliminary results of our experimentation show the effectiveness of the simple algorithms proposed. In future, we intend to extend the proposed algorithms for estimating the object speed and localization using distributed algorithms, involving multiple sensor nodes with collaborative sensing, while achieving a real-time implementation.

ACKNOWLEDGMENT

Research presented in this paper was funded by a Strategic Research Cluster grant (07/SRC/I1168) by Science Foundation Ireland under the National Development Plan. The authors gratefully acknowledge this support.

REFERENCES

[1] Q. Cai and J. Aggarwal, "Tracking human motion in structured environments using a distributed-camera system," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 11, pp. 1241-1247, 1999.
[2] P. Muralt, "Micromachined infrared detectors based on pyroelectric thin films," Reports on Progress in Physics, vol. 64, no. 10, p. 1339, 2001.
[3] C. Tsai and M. Young, "Pyroelectric infrared sensor-based thermometer for monitoring indoor objects," Review of Scientific Instruments, vol. 74, p. 5267, 2003.
[4] J. Fang, Q. Hao, D. Brady, M. Shankar, B. Guenther, N. Pitsianis, and K. Hsu, "Path-dependent human identification using a pyroelectric infrared sensor and Fresnel lens arrays," Optics Express, vol. 14, no. 2, pp. 609-624, 2006.
[5] D. Karuppiah, P. Deegan, E. Araujo, Y. Yang, G. Holness, Z. Zhu, B. Lerner, R. Grupen, and E. Riseman, "Software mode changes for continuous motion tracking," Lecture Notes in Computer Science, pp. 161-180, 2001.
[6] A. Arora, R. Ramnath, E. Ertin, P. Sinha, S. Bapat, V. Naik, V. Kulathumani, H. Zhang, H. Cao, M. Sridharan et al., "Exscal: Elements of an extreme scale wireless sensor network," in 11th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, 2005, pp. 102-108.
[7] A. Prati, R. Vezzani, L. Benini, E. Farella, and P. Zappi, "An integrated multi-modal sensor network for video surveillance," in Proceedings of the Third ACM International Workshop on Video Surveillance & Sensor Networks, 2005, pp. 95-102.
[8] Q. Hao, D. Brady, B. Guenther, J. Burchett, M. Shankar, and S. Feller, "Human tracking with wireless distributed pyroelectric sensors," IEEE Sensors Journal, vol. 6, no. 6, pp. 1683-1696, 2006.
[9] D. Kim, J. Choi, M. Lim, and S. Park, "Distance correction system for localization based on linear regression and smoothing in ambient intelligence display," in Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2008, pp. 1443-1446.
[10] R. Cucchiara, A. Prati, R. Vezzani, L. Benini, E. Farella, and P. Zappi, "Using a wireless sensor network to enhance video surveillance," Journal of Ubiquitous Computing and Intelligence, vol. 1, pp. 1-11, 2006.
[11] P. Zappi, E. Farella, and L. Benini, "Enhancing the spatial resolution of presence detection in a PIR based wireless surveillance network," in IEEE Conference on Advanced Video and Signal Based Surveillance, 2007, pp. 295-300.
[12] I. Amin, A. Taylor, F. Junejo, A. Al-Habaibeh, and R. Parkin, "Automated people-counting by using low-resolution infrared and visual cameras," Measurement, vol. 41, no. 6, pp. 589-599, 2008.
[13] P. Zappi, E. Farella, and L. Benini, "Pyroelectric InfraRed sensors based distance estimation," in IEEE Sensors, 2008, pp. 716-719.
[14] J. Fang, Q. Hao, D. Brady, B. Guenther, and K. Hsu, "A pyroelectric infrared biometric system for real-time walker recognition by use of a maximum likelihood principal components estimation (MLPCE) method," Optics Express, vol. 15, no. 6, pp. 3271-3284, 2007.
[15] T. Hussain, A. Baig, T. Saadawi, and S. Ahmed, "Infrared pyroelectric sensor for detection of vehicular traffic using digital signal processing techniques," IEEE Transactions on Vehicular Technology, vol. 44, no. 3, pp. 683-689, 1995.


Wavelength modulated off-axis integrated cavity system for trace H₂S measurements

Haitao Gu 1,2, Dahai Yu 3, Xia Li 3, Xiumin Gao 2, Wei Huang 3, Stephen Daniels 1, Jian Wang 3

1. Dublin City University, Dublin 9, Ireland
2. School of Electronics and Information, Hangzhou Dianzi University, Hangzhou 310018
3. Focused Photonics (Hangzhou) Inc., Hangzhou 310052

E-mail: jian_wang@fpi-inc.com

Abstract

Trace-level measurements of gases play important roles in many fields, especially in industrial process control, for the purposes of optimizing the process, reducing energy consumption, and improving production efficiency and safety. In this article, as an example, 0-20 ppm trace H₂S in natural gas has been measured by means of off-axis integrated cavity output spectroscopy (OA-ICOS) technology. The off-axis alignment geometry was used to eliminate the problem of mode matching between the laser and the optical resonant cavity, and in order to gain higher measurement resolution, the wavelength modulation technique was also employed in the system. Experiments demonstrated that the repeatability of the instrument in H₂S measurement was 0.30 ppm (3σ) and the linearity was less than 3% F.S., which met the requirements of trace H₂S measurement in the petrochemical and natural gas industries.

Keywords: OA-ICOS, DLAS, wavelength modulation, H₂S

1. Introduction

Accurate measurement of trace H₂S is highly significant for petrochemical process analysis, natural gas storage and transportation security, environmental monitoring and health care. For instance, purification plants, gas transport stations and CNG (compressed natural gas) stations need to measure trace H₂S in the range of 0-20 ppm in order to prevent corrosion of pipelines. Traditional methods for H₂S measurement include lead acetate strips, gas chromatography and electrochemical technology, but the disadvantages of these methods are high drift, slow response, etc. In addition, lead acetate strips and gas chromatography consume a large amount of materials and incur high maintenance costs. Diode laser absorption spectroscopy (DLAS) can solve the above problems; however, DLAS alone cannot detect such a low concentration of H₂S [1].

Various methods and techniques based on DLAS have been developed, which include direct absorption spectroscopy [2], wavelength modulation (WM) spectroscopy and frequency modulation (FM) spectroscopy [3], photo-acoustic spectroscopy [4], cavity ring-down spectroscopy (CRDS) [5], cavity enhanced absorption spectroscopy (CEAS) [6] and integrated cavity output spectroscopy (ICOS) [7], etc. They all show different sensitivity and accuracy in trace gas measurement. In 2001, J. B. Paul [8] proposed off-axis integrated cavity output spectroscopy, which had a simple structure and was a great improvement for high-finesse optical cavity spectroscopy.

Combining the advantages of OA-ICOS and WM-DLAS, this paper develops a gas analysis instrument with a simple structure. The instrument uses a diode laser and a photoelectric detector, without a piezoelectric transducer, chopper, APD, or other complex and expensive devices, which makes the instrument more stable and cheaper. Meanwhile, the instrument uses wavelength modulation methods, which suppress the noise and achieve higher detection sensitivity. In this article, trace H₂S in the range of 0-20 ppm is accurately and sensitively measured using OA-ICOS in the near-infrared wavelength region.

2. OA-ICOS principle

2.1. The cavity mode matching

From Fabry-Perot theory, the stability condition of an optical resonator is defined by the inequality:


0 \le \left(1 - \frac{d}{R_1}\right)\left(1 - \frac{d}{R_2}\right) \le 1    [1]

where d is the cavity length and R_1 and R_2 are the mirror radii of curvature. In the resonant cavity, the light beam can reflect between the high-reflectivity mirrors tens of thousands of times. However, interference of cavity resonances arises because of the periodic boundary condition imposed by the mirror surfaces. Only light whose frequency matches the resonance transmission spectrum can pass through the cavity; the other beams are suppressed. In the coaxial case, the FSR (free spectral range) of the cavity is directly determined by the cavity length. In coaxial cavity ring-down spectroscopy and coaxial ICOS, a piezoelectric transducer is often used to change the cavity length, so that the single-frequency laser can be locked to a separate single resonance mode of the cavity. This is therefore complex and expensive.

In the 1960s, Herriott and others first investigated off-axis coupling into the optical cavity [9, 10], and OA-ICOS was developed based on Herriott's off-axis alignment [8]. In the off-axis alignment case, the laser beam makes multiple reflections within the cavity before two light spots overlap on one mirror surface. The resonance distance increases remarkably, and the FSR therefore reduces. If the number of optical round-trip passes is large enough, the FSR becomes small enough that the cavity transmission is effectively averaged over frequency. That means the light beam can pass through the cavity without mode matching between the single-frequency laser and the resonance modes of the cavity. For example, a 0.5 m cavity has an FSR of 300 MHz; if the beam makes 100 round trips, the cavity resonance distance becomes 200 times the cavity length and the FSR reduces to only 1.5 MHz, whereas the line-width of diode lasers used in spectroscopy is about 10-100 MHz and the FWHM of gas absorption lines in the near-infrared is of 1 GHz magnitude. In this paper, the number of optical round-trip passes has been increased to 300, and the mode matching problem has been solved successfully.

2.2. Absorption enhancement

In OA-ICOS, light transmission also follows the Beer-Lambert law:

I = I_0 e^{-\alpha(v)L} = I_0 e^{-\sigma(v) C L}    [2]

where I_0 is the intensity of the incident light, α(v) is the absorbance per unit length, C is the concentration of the gas, σ(v) is the absorption cross-section, L is the effective optical path length and v is the centre frequency of the absorption line.

The laser beam passes through the gas in the cavity many times and the effective optical path length increases remarkably. In OA-ICOS, the effective optical path length is related to the reflectivity of the cavity mirrors, the cavity length and other parameters, as follows [11]:

L(v) = -\frac{1}{\alpha(v)} \ln\!\left[\frac{(1 - R^2)\, e^{-\alpha(v)d}}{1 - R^2 e^{-2\alpha(v)d}}\right]    [3]

where R is the reflectivity and d is the cavity length. If R = 99.97% and α(v) = 6.5 × 10⁻⁸ cm⁻¹ (equivalent to the absorption of 10 ppm H₂S over a 1 cm length), with d = 0.5 m, the effective optical path length will be L ≈ 1666 m, an increase of about 3333 times the cavity length. Compared to a traditional single-pass instrument, the absorption is enhanced remarkably and a much lower minimum detection limit is achieved.

2.3. WM-DLAS

OA-ICOS is combined with the high-sensitivity WM-DLAS technology. The diode laser is tuned by a modulation signal which combines a sinusoidal wave and a triangle wave. The laser wavelength, tuned by the triangle wave, scans across the absorption line of the measured gas, while the wavelength is modulated by the sinusoidal wave simultaneously.
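A quick numerical check of the path-length enhancement is possible with equation [3] as reconstructed above, using the values quoted in the text (R = 99.97%, α(v) = 6.5 × 10⁻⁸ cm⁻¹, d = 0.5 m); the result is approximately 1.66 km, consistent with the ~1666 m (about 3333 cavity lengths) stated.

```python
# Numerical check of the reconstructed effective path length, equation [3].
import numpy as np

def effective_path_length(R, alpha_per_cm, d_m):
    """L(v) = -(1/alpha) ln[(1 - R^2) e^(-alpha d) / (1 - R^2 e^(-2 alpha d))]."""
    alpha = alpha_per_cm * 100.0          # convert cm^-1 to m^-1
    a = alpha * d_m                       # single-pass absorbance alpha*d
    ratio = (1 - R**2) * np.exp(-a) / (1 - R**2 * np.exp(-2 * a))
    return -np.log(ratio) / alpha         # metres

# Paper's values: R = 99.97%, alpha = 6.5e-8 cm^-1, d = 0.5 m -> ~1.66 km
print(effective_path_length(R=0.9997, alpha_per_cm=6.5e-8, d_m=0.5))
```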
The transmission signals are received by a photoelectric detector and passed through a high-pass filter; a phase-sensitive detector then "locks in" and amplifies the second harmonic signals. These signals are proportional to the concentration of the gas [12], as follows:

S_{2f} = I_0 H_2(v, a) = S(T)\, P\, C\, L\, I_0 \left\{ \frac{1}{\pi} \int_{-\pi}^{\pi} g[v - v_0 + a\cos(u)] \cos(2u)\, du \right\}    [4]

where S(T) is the absorption intensity of the absorption lines, the lineshape function g[v - v_0 + a cos(u)] represents the shape of the absorption lines, and P is the pressure. At the same time, the photoelectric detector signals pass through a low-pass filter, giving the DC component:

DC = I_0    [5]

From equations 4 and 5, the gas concentration can be solved:

C = K \, \frac{S_{2f}/DC}{P\, L\, S(T)\, B(P,T)} + b_0    [6]

where B(P,T) = \frac{1}{\pi}\int_{-\pi}^{\pi} g[v - v_0 + a\cos(u)]\cos(2u)\,du is the compensation factor related to the shape of the absorption lines, K is the calibration factor, and b_0 is the zero factor.
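The retrieval in equation [6] is a simple ratio once the 2f and DC signals are available. A minimal sketch follows; K, b_0 and B(P, T) are instrument calibration quantities, and the placeholder defaults below are not from the paper.

```python
# Sketch of the concentration retrieval in equation [6]; calibration
# values K and b0 are placeholders, not values from the paper.

def h2s_concentration(s2f, dc, P, L, S_T, B_PT, K=1.0, b0=0.0):
    """C = K * (S_2f / DC) / (P * L * S(T) * B(P, T)) + b0."""
    return K * (s2f / dc) / (P * L * S_T * B_PT) + b0
```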


3. Experimental

The experimental setup employed in the present work is depicted in figure 1. Highly reflective 25.4 mm diameter concave mirrors (1 m radius of curvature), separated by a 50 cm stainless steel spacer, formed the OA-ICOS cavity. The reflection coefficient of the ultra-low-loss mirrors was estimated to be R ≥ 99.95% using a CRDS approach. After being focused by a small-aperture convex lens, the laser beam was collimated. The collimated light was injected into the cavity through one mirror, with an angle of about 1° between the light and the cavity axis. At the other side of the cavity, a convex lens was used to collect the emission beam from the cavity, and a photoelectric detector was placed at the focal point of the lens. After processing by the circuit, the DC signal and the second harmonic signal (S_2f) were derived from the signals received by the photoelectric detector. The gas concentration can then be solved by equation 6.

Figure 1. The schematic diagram of the instrument.

A DFB laser was used as the light source, operating in the near infrared. Frequency tuning of the single-mode diode laser can be carried out via the current (over more than 1 cm⁻¹). The laser modulation signals were generated by a function generator. The frequency of the sinusoidal wave was 15 kHz, the frequency of the triangle wave was 100 Hz, and the triangle-wave range determined the scan range of the laser wavelength.

In this article, the measured absorption line of H₂S was in the 1.55 µm wavelength region [1]. Under normal pressure and room temperature, the absorbance of 1% H₂S in a 1 cm path length is 6.5 × 10⁻⁵ cm⁻¹. The following diagram shows the absorption spectrum of H₂S in the near-infrared region:

Figure 2. Absorption spectrum of H₂S at 1% concentration and 1 cm path length under normal pressure and room temperature (298 K, 1 bar; from the PNNL database).

4. Results and discussion

4.1. Linearity

The full measurement range for H₂S was 20 ppm. After zero calibration, 35% F.S. (7 ppm), 50% F.S. (10 ppm) and 70% F.S. (14 ppm) standard gases were separately flowed through the cavity. The second harmonic signals are shown in figure 3, together with an inset in the upper right corner showing the relationship between signal intensity and concentration. The linearity is less than 3% F.S.

Figure 3. Linearity of the instrument in H₂S measurement.

4.2. Repeatability

With different concentrations of H₂S standard gas, the measurement results are shown in figure 4. The peak-to-peak deviations of the output with constant input can be calculated from figure 4.


The measurement repeatability is normally defined as 3 times the statistical standard deviation, giving 3σ = 0.30 ppm.

Figure 4. The measurement repeatability of the instrument in H₂S measurement.

4.3. H₂S measurements in natural gas

When measuring natural gas, methane absorption lines are common. Methane has several overtone and combination absorption bands in the near-infrared wavelength range. At wavelengths of 1.56-1.63 μm H₂S has strong absorption lines, but methane still absorbs light at a low level. The situation is fairly similar for other hydrocarbons. Due to the lack of rotationally resolved absorption lines, these molecules have a broad, background-like absorption which varies slowly within the wavelength-tuning range of the diode laser. It became obvious that reliable measurement of H₂S in natural gas was only possible if a reference gas, with the H₂S removed, was used to determine and subtract the methane content from the H₂S measurement. Figure 5 shows the spectral response of the H₂S analyser using this differential method. Figure 6 illustrates the linearity of the system. The reference gas was generated by letting the natural gas flow through an absorbing cell that contained Fe₂O₃ and FeO as an H₂S scrubber.

Figure 5. Spectral response to H₂S using a differential method (25 ppm H₂S in a background of 50% CH₄ and 2.5% CO₂, with and without the H₂S scrubber).

Figure 6. The linearity of 0-10 ppm H₂S.

5. Conclusion

In this article, we have measured trace H₂S by a combination of OA-ICOS and WM-DLAS. The laser beam is injected into the high-finesse resonant cavity by an off-axis alignment method. This resolves the mode matching problem between the laser frequency and the modes of the resonant cavity, and simplifies the structure of the instrument. Combining it with WM-DLAS technology improves the detection sensitivity of the instrument. By making the light beam reflect multiple times in the high-finesse resonant cavity, the effective optical path length increases significantly, so that a lower minimum detection limit is achieved. Experiments demonstrate that the measurement repeatability of the instrument is 0.303 ppm (3σ) and the linearity is less than ±3% in the range of 0-20 ppm. The instrument meets the trace H₂S measurement requirements of the petrochemical and natural gas industries.

Acknowledgement: This work was supported by the National Natural Science Foundation of China (50574035) and the National 863 Project of China (2006AA040310, 2007AA04Z196).

6. References

[1] Weldon, V., O'Gorman, J., Phelan, P., Hegarty, J., Tanbun-Ek, T., "H₂S and CO₂ gas sensing using DFB laser diodes emitting at 1.57 µm," Sensors and Actuators B, 29, 1995, pp. 101-107.
[2] Allen, M.G., "Diode laser absorption sensors for gas-dynamic and combustion flows," Meas. Sci. Technol., 9, 1998, pp. 545-562.
[3] Bomse, D.S., Stanton, A.C., Silver, J.A., "Frequency modulation and wavelength modulation spectroscopies: comparison of experimental methods using a lead salt diode laser," Appl. Opt., 31, 1992, pp. 718-731.
[4] Varga, A., Bozoki, Z., Szakall, M., Szabo, G., "Photoacoustic system for on-line process monitoring of hydrogen sulfide (H₂S) concentration in natural gas streams," Appl. Phys. B, 85, 2006, pp. 315-321.


[5] O'Keefe, A., Deacon, D.A.G., "Cavity ring-down optical spectrometer for absorption measurement using pulsed laser sources," Rev. Sci. Instrum., 59(12), 1988, pp. 2544-2551.
[6] Engeln, R., Berden, G., Peeters, R., Meijer, G., "Cavity enhanced absorption and cavity enhanced magnetic rotation spectroscopy," Rev. Sci. Instrum., 69, 1998, pp. 3763-3769.
[7] O'Keefe, A., "Integrated cavity output analysis of ultra-weak absorption," Chem. Phys. Lett., 293, 1998, pp. 331-336.
[8] Paul, J.B., Lapson, L., Anderson, J.G., "Ultrasensitive absorption spectroscopy with a high-finesse optical cavity and off-axis alignment," Appl. Opt., 40(27), 2001, pp. 4904-4910.
[9] Herriott, D.R., Kogelnik, H., Kompfner, R., "Off-axis paths in spherical mirror interferometers," Appl. Opt., 3, 1964, pp. 523-526.
[10] Herriott, D.R., Schulte, H.J., "Folded optical delay lines," Appl. Opt., 4, 1965, pp. 883-889.
[11] Fiedler, S.E., Hese, A., Ruth, A.A., "Incoherent broad-band cavity-enhanced absorption spectroscopy," Chem. Phys. Lett., 371, 2003, pp. 284-294.
[12] Wang, J., Maiorov, M., Baer, D., et al., "In situ combustion measurements of CO with diode-laser absorption near 2.3 µm," Appl. Opt., 39(30), 2000, pp. 5579-5589.


Integrated air quality monitoring: applications of geosensor networks

Jer Hayes ¥, King-Tong Lau*, Robert J. McCarthy ¥, Dermot Diamond*

*Clarity - centre for sensor web technologies, Dublin City University, Glasnevin, Dublin 9, Ireland
¥ IBM, Innovative environmental solutions, Mulhuddart, Dublin 15, Ireland

E-mail: hayesjer@ie.ibm.com; kim.lau@dcu.ie; rjmccarthy@ie.ibm.com; dermot.diamond@dcu.ie

Abstract

Worldwide environmental monitoring has become ever more important due to the increasing pollution caused by human activities and also as a result of climate change. The measurement of a wide spectrum of environmental parameters is very labour-intensive and costly using the current conventional sampling methods, whereby samples are normally collected and transported to a laboratory for analysis, or hand-held devices are used for on-the-spot measurements. These approaches are expensive and result in very low measurement frequency. To increase temporal and spatial resolution, geosensor networks using various techniques, including wireless autonomous sensing, low-cost sensor networks and information extraction from web-available satellite remote sensing data, are required. We describe and demonstrate ongoing work on using remote sensing to monitor atmospheric NO₂ levels in Ireland and on novel in-situ gas sensing.

1. Introduction

Human activities are the main source of pollution in modern-day society. Agricultural wastes leached from farmland, toxic wastes emitted from various industries, and commuter cars pumping out toxic chemicals have numerous impacts on our daily lives. Pollution is changing the weather, affecting our health and also changing our lifestyle. The modern lifestyle is a double-edged sword: it makes life more comfortable, but it is also known to contribute greatly to environmental destruction and pollution. It was unthinkable for people in the eighties of the last century to carry a bottle of water with them whenever they went out, and that the water could come from many thousands of miles away. Combined with the disposable culture created by high-throughput mass production, driven by the consumer market, the level of pollution and destruction of our environment has escalated at a pace the earth has never seen before.

Environmental monitoring and pollution control are becoming urgent tasks in light of the climate change we have experienced in recent years. As a result, water resources in some areas have become less available, whereas in other places flooding and related water pollution are occurring with higher frequency and severity. On the other hand, the ever-increasing dependency of our society on fossil fuels contributes to the pollution of air, land and water bodies. Consequently, environmental protection is much needed to safeguard our very existence in this increasingly hostile world. Facing such a huge scale of pollution and environmental change affecting all human lives, the task of monitoring water, soil and atmosphere parameters is a global challenge, and all countries are obliged to take considerable effort in monitoring and controlling pollution levels.

Geosensor networks are being deployed which use various techniques, including wireless autonomous sensing, low-cost sensor networks and information extraction from web-available satellite remote sensing data, to increase the temporal and spatial resolution of environmental monitoring. The adaptive sensors group [1] is involved in the development of sensors and sensor networks for air and water quality monitoring (e.g. [2,3]) as part of Clarity, the centre for sensor web technologies [4].
In-situ sensors can give precise information for particular points, but they do not have sufficient coverage to give information even at a regional level. Remote sensing is used to provide a more complete picture of pollutant dispersion at a much larger scale. To support in-situ sensing, remote sensing, which covers larger spatial areas, is required; for example, satellite-based systems can gather data for the whole globe. We will describe and demonstrate ongoing work on NO₂ levels in Ireland using remote sensing data from satellites. This includes developing visualisation techniques to automatically process and present data in a useful manner.


One example of a remote sensing satellite is the Environmental Satellite (ENVISAT) built by the European Space Agency (ESA). The satellite hosts a suite of instruments which can be used for measuring ocean currents and ocean topography; landscape topography and the presence of snow and ice; ocean colour and biology (e.g. algal blooms); vegetation types; the presence of clouds; precipitation; sea surface temperature; and atmospheric chemistry. One instrument onboard ENVISAT is the SCanning Imaging Absorption spectroMeter for Atmospheric CartograpHY (SCIAMACHY), a satellite spectrometer designed to measure solar radiation transmitted, backscattered and reflected from the atmosphere. The data is recorded at relatively high resolution in the ultraviolet, visible and near-infrared wavelength regions. This system produces valuable raw data which are made available online on its website. However, useful information can only be obtained after complicated data extraction and processing.

There are two basic reasons for using SCIAMACHY: (1) pollution does not respect borders, so we need to examine the trans-boundary transport of pollution; (2) there has been a worldwide increase in tropospheric greenhouse gases, so we need to determine the situation in Ireland with respect to the rest of Europe and the world.

How do we show that pollution moves across borders? Examining the accumulation of NO₂ over long periods can reveal the location of NO₂ sources and also indicate the typical movements of this gas. In the context of Europe, Ireland has a relatively low NO₂ rating (which reflects its low population).

It is often a cumbersome task to extract useful information from the vast sea of data collected by satellite-based instruments. Huge amounts of raw data are downloaded, from which selected data from the geographical zone of interest are then extracted and classified. To the processed data we apply a simple visualisation technique that allows users to browse the overall results and to find data of interest (see section 2). This work is being done to complement work on wireless chemical sensors, which is highlighted in section 3. A complete picture of outdoor air quality monitoring requires the integration of in-situ and remote sensing data.

2. Visualising data

Online web-based browsers for SCIAMACHY data already exist, e.g. the SCIAMACHY tropospheric DOAS nadir data browser [5]. However, only general, low-resolution visual data are presented, divided into a number of views, with the one giving the most coverage of Ireland being a European-wide view. Although such views are useful, a European-wide view will likely show the highest levels of NO₂ over parts of the UK and the Benelux countries, and it is not useful for in-depth analysis of individual countries such as Ireland.

For the purposes of clarity we would like to present users with a visualization of location-specific NO₂ levels with respect to a particular period, such as that shown in Figure 1, which highlights Ireland and part of the UK. Ideally, images of NO₂ levels are made available online, where the raw data from which these images are created can be accessed through the browser. Because these images are made up of boxes covering areas where average values of gas concentration are represented by a colour scheme, we consider them only an approximate representation of the data, especially when a long time frame is selected.
These images are best used to give an overview of the gas dispersion, with the raw data available when requested.

Figure 1: The NO₂ levels are measured by the SCIAMACHY instrument on the Envisat satellite. In this example the atmospheric volume directly under the satellite is observed (nadir). Each scan covers an area on the ground of up to 960 km across track, with a maximum resolution of 26 km x 15 km.

In this work we use a collection of applications (including VISAN [6]) and scripts to create images similar to that in Fig. 1. The images are contour plots of NO₂ vertical column densities with geo-referencing information that can be displayed and interacted with through programs such as ArcGIS Explorer and Google Earth. These programs are preferred to Google Maps (see Fig. 2) as they can display more data points. The images are automatically generated from data made available by the ESA. The process of creating the contour plots works as follows: (1) satellite data products in N1 format are converted to ASCII format using VISAN [6]; (2) the data is smoothed using inverse squares (a Cauchy weight function); (3) the contour plots are created from this smoothed data (see Fig. 3). Currently, contour plots are made for monthly data sets.
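Step (2), the inverse-square (Cauchy) smoothing of the scattered column densities onto a regular grid, might look like the sketch below; the weight w = 1/(1 + (r/γ)²) and the smoothing scale γ are assumptions, since the exact formulation is not given here.

```python
# Sketch of step (2): gridding scattered NO2 column densities with a
# Cauchy (inverse-square) distance weight; gamma (degrees) is an assumption.
import numpy as np

def cauchy_smooth(lon, lat, vcd, grid_lon, grid_lat, gamma=0.5):
    """Weighted average of scattered values vcd (numpy arrays of equal
    length) at each grid point, with weight w = 1 / (1 + (r / gamma)^2)."""
    out = np.empty((len(grid_lat), len(grid_lon)))
    for i, glat in enumerate(grid_lat):
        for j, glon in enumerate(grid_lon):
            r2 = (lon - glon) ** 2 + (lat - glat) ** 2
            w = 1.0 / (1.0 + r2 / gamma**2)
            out[i, j] = np.sum(w * vcd) / np.sum(w)
    return out  # ready for a contouring routine (step 3)
```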


The colour-coded data that are used to create the data plots are shown in Fig. 2, where a map-based interface is used to indicate the locations from which the data are collected.

Figure 2: The SCIAMACHY instrument operates in three modes: Nadir, Limb and Occultation. For a single day the Nadir measurements do not cover all of Ireland (and the UK), and so data is typically built up over several weeks. For example, on this day most of Northern Ireland was not covered by the nadir measurement.

Monthly plots are chosen as SCIAMACHY may pass over parts of Ireland only every 1 to 3 days and measures in three viewing geometries: Nadir, Limb and Occultation. In Nadir mode the atmospheric volume directly under the instrument (i.e. the spacecraft) is observed. In Limb mode the instrument looks at the edge of the atmosphere. Using only Nadir or Limb modes, global coverage is achieved within 3 days (for a 960 km swath width). So for any individual day there are blank areas where no data is available (see Fig. 2).

3. Chemical sensor networks

Yearly NO₂ data from SCIAMACHY over the Dublin region in Ireland is presented in Fig. 4. Remote sensing data can provide background-level information for parameters of interest over very large areas. However, the resolution of these sensors means that we still need in-situ sensors to provide a more accurate local pollution level to complete the picture. Commercial gas sensors generally lack selectivity, require high operating voltages and have high power consumption. This demands the development of novel low-powered sensing techniques that can measure the gas concentrations required for atmospheric pollution control [7]. The adoption of wireless sensor networks adds another parameter to using gas sensors, namely that they should be relatively cost effective to ensure affordability.

One interesting area is the use of colorimetric sensors, whereby a colour change indicates the presence (and amount) of a chemical species. A simple polymer-based colorimetric sensor can be fabricated by dissolving a pH indicator dye into a polymer solution [8]. This polymer formulation can be coated onto the surface of low-cost optical sensor components such as LEDs. Thus a low-cost, low-power wireless sensor network based on an optical sensing system can be realized (see Fig. 5). The current focus of WSN research tends to be on hardware, communication protocols and power management, and also on simulation/modelling of these networks. Clearly research has to be carried out in these areas to solve the fundamental issues concerned. However, sensor nodes are platforms for hosting (chemical, biochemical) sensors and, as such, consideration must also be given to sensor development and integration, as the sensors provide the vital information on the environment.

There is still a large gap between the development of wireless sensor networks and the development of chemical sensors, as research into each remains essentially a discrete field despite the growing interest in merging these two disciplines. However, wireless chemical sensor networks have been deployed in controlled environments [3,8], and these could easily be modified for use in outdoor scenarios. These wireless chemical sensor networks can be used to complement the remote sensing technique to provide a more complete picture of the environment.

4. Conclusions

To increase the temporal and spatial resolution of environmental monitoring, geosensor networks are being developed.
These networks use various techniques, including wireless autonomous sensing, low-cost sensor networks and information extraction from web-available satellite remote sensing data, to monitor the environment. We described ongoing work on NO₂ levels in Ireland using data from SCIAMACHY and also outlined ongoing work on wireless chemical sensor networks. Currently the satellite data is being used to provide information on background levels of various tropospheric gases.

Acknowledgements

The authors wish to thank the following for their support: Science Foundation Ireland (SFI 07/CE/I1147, "Clarity: centre for sensor web technologies"), Barry Fennell of Enterprise Ireland and the European Space Agency.


5. References

[1] The Adaptive Sensors Group, http://www.dcu.ie/chemistry/asg/
[2] Christina M. McGraw, Shannon E. Stitzel, John Cleary, Conor Slater, Dermot Diamond, (2007) Autonomous microfluidic system for phosphate detection, Talanta, Volume 71, Issue 3, pages 1180-1185.
[3] Roderick Shepherd, Stephen Beirne, King Tong Lau, Brian Corcoran, Dermot Diamond, (2007) Monitoring chemical plumes in an environmental sensing chamber with a wireless chemical sensor network, Sensors and Actuators B: Chemical, Volume 121, Issue 1, Special Issue: 25th Anniversary of Sensors and Actuators B: Chemical, pages 142-149.
[4] CLARITY: centre for sensor web technologies, http://www.clarity-centre.org/
[5] SCIAMACHY Data Browser, http://www.iup.uni-bremen.de/doas/scia_data_browser.htm
[6] VISAN documentation, http://www.stcorp.nl/beat/documentation/visan.html
[7] A. Martin, J.P. Santos, H. Vasquez and J.A. Agapito, Study of interferences of NO2 and CO in solid state commercial sensors. Sens. Actuators B 58 (1999), pp. 469-473.
[8] Jer Hayes, Stephen Beirne, Breda M. Kiernan, Conor Slater, King-Tong Lau and Dermot Diamond (2008). Chemical Species Concentration Measurement via Wireless Sensors. In the proceedings of World Academy of Science, Engineering and Technology, Venice, Italy (2008).

Figure 3: Monthly averages of NO₂ levels over the Dublin area for 2007. The data was derived from SCIAMACHY data products provided by the European Space Agency. Remote sensing data can provide background-level information for parameters of interest over very large areas. However, the resolution of these sensors means that we still need in-situ sensors to provide a complete picture. [Plot: NO₂ vertical column density (VCD) and its standard deviation by month.]

Figure 4: An image of a contour plot of NO₂ vertical column densities with geo-referencing information that can be displayed and interacted with through programs such as ArcGIS Explorer and Google Earth. These images are automatically generated from data made available by the European Space Agency.

Figure 5. An example of an LED-based optical sensor platform.


Approximate Analysis of Fibre Delay Lines and Wavelength Converters in an Optical Burst Switch

Daniele Tafani and Conor McArdle
Research Institute for Networks & Communications Engineering, School of Electronic Engineering, Dublin City University, Ireland
E-mail: tafanid@eeng.dcu.ie

Abstract—We consider an optical burst switch with and without tuneable wavelength converters and with varying numbers of fibre delay lines. We propose a virtual flow model of traffic within the switch and apply the Equivalent Random Method to resolve blocking probability analytically. Our emphasis is on approximate models with good numerical efficiency in the solution. Analytic results are compared to results from discrete-event simulations.

I. INTRODUCTION

In recent years, Optical Burst Switching (OBS) [1] has been proposed as a possible near-term solution to utilise the capacity of deployed optical fibre efficiently. In an optical burst switch, contention occurs when two or more incoming bursts are directed to the same output channel at the same time. When this happens, a contention resolution strategy may be applied to prevent burst loss. The most common strategies involve the use of Fibre Delay Lines (FDLs) and wavelength converters.

There are several existing approaches to performance evaluation of OBS nodes equipped with FDLs or wavelength converters. In [2], Callegati evaluates burst blocking probability, modelling a single FDL as a queue with balking. An exact Markov chain analysis of an FDL, for correlated arrivals, is considered in [3]. An analysis of limited numbers of wavelength converters has been developed in [4], while converters with a limited conversion range are considered in [5]. Gauger [6] investigates the performance of the combination of wavelength converters and FDL buffers through simulations.

The approach in the current paper is based on resolving a network of relatively simple queueing systems, representing virtual traffic flows within the node. This differs from previous work, which has focused mainly on detailed evaluation of single queueing system models. We apply Equivalent Random Theory (ERT) [8] to resolve the blocking probability of an OBS switch equipped with both FDLs and wavelength converters. Our work most closely relates to [4], where ERT has been applied to approximate analysis of shared wavelength converters in an OBS node without FDLs.

(This material is based on research supported by Science Foundation Ireland (SFI) under the Research Frontiers Programme Grant No. [08/RFP/CMS1402].)

II. MODEL DESCRIPTION

We focus on the analysis of burst blocking probability in an optical burst switch having N wavelength channels in each of P input/output ports and a bank of K fibre delay lines, which are shared by all ports (Fig. 1). Additionally, we will consider the cases of (a) full wavelength conversion, where tuneable converters at the input ports allow a burst on any incoming wavelength channel to be switched to any outgoing channel or to any channel in an FDL, where each FDL can carry up to R bursts simultaneously, R ≤ N, and (b) no wavelength conversion, where there are no converters present and an incoming burst must exit on the same wavelength on which it arrived or be delayed in one of K single-channel FDLs. We next describe the combined behaviour of the output channels, FDLs and wavelength converters, which determines how we model the overall switch behaviour.

Fig. 1. Optical Burst Switch Under Study. [Figure: P input fibre ports, each with N input wavelength channels (with or without wavelength converters); an optical switch; P output fibre ports, each with N channels; and K delay-line ports, each with R channels, providing delays D1 = C, ..., DK = K·C.]

Each FDL unit is a single fibre offering a constant delay time of D_k seconds, k ∈ {0, 1, 2, ..., K}. The delay times of the units are each a multiple of a base delay time C, such that D_k = kC. A controller coordinates the scheduling of the output channels, FDLs and wavelength converters. The controller aims to resolve contention between bursts arriving from different input ports that are destined for the same output channel. If wavelength converters are present in the switch and none of the N output channels is available for the duration of a burst arriving at a time t, an attempt is made to simultaneously schedule a free FDL channel (with delay length D_k) and any outgoing channel that will become free at time t + D_k.


For the case of no wavelength conversion, if the burst's wavelength channel is busy, an FDL is sought that delays the burst sufficiently until that same wavelength channel becomes free. In either case, the scheduler first attempts the procedure using FDL unit 1, offering delay D_1, and iterates in sequence through successively longer FDLs until a feasible schedule is found. If none of the available FDL delay times can resolve the schedule, then the burst is blocked (lost).

A queueing model that approximates this behaviour is proposed in Fig. 2. In the case of full wavelength conversion, the combined action of the output channels in an output port and the input wavelength converters is modelled as a fully accessible blocking system with N channels [9] (an M/M/N/N system). In the case of no wavelength conversion, the output channels are modelled as N independent single-server blocking systems (M/M/1/1 systems). In either case, we may consider bursts that would be immediately blocked at an output port if there were no FDLs present as forming an imaginary overflow traffic that is offered as an input traffic flow to the bank of FDLs. This flow is indicated as F̂ in Fig. 2. The combined overflow from all P output ports, offered to the shared FDL bank, is indicated as P·F̂.

Fig. 2. Virtual Flow Model of Switch Output Port with FDLs. [Figure: input flow F_I and FDL carried traffic F̄ combine into the effective offered traffic F at each of the P output ports with N output channels; the per-port overflow F̂ aggregates to P·F̂, which feeds the bank of K shared FDLs, with overflows F̂_1, ..., F̂_{K-1} cascading from FDL to FDL; the final overflow F̂_B is the traffic blocked from the node.]

The bank of FDLs is modelled as a sequence of overflowing blocking systems, with overflowing (blocked) traffic F̂_k from FDL k feeding FDL k + 1. The aggregate traffic carried by all FDLs, P·F̄, is additional traffic that, having exited the FDL units, must then be carried by the output ports. We assume, without loss of generality, that loading across output ports is evenly distributed, as are overflow volumes from each output port, so carried traffic from the FDL bank is also evenly distributed to the P output ports, with each port receiving flow F̄. The combination of this FDL carried traffic and the actual offered flow to each port, F_I, is denoted F, which we call the effective offered traffic. We assume that the actual offered traffic to the port is Poisson in nature and we assume that the feedback flow (F̄) is small in comparison with this. Thus, neglecting any traffic correlations, we assume the effective input traffic F is also Poisson.

Traffic which overflows from the FDL bank (F̂_B) is lost from the system. In the next section, we resolve the mean of the effective input flow F and, from this, the mean of the blocked traffic F̂_B, from which we may calculate the blocking probability. In developing our queueing analysis, we assume that the burst durations are exponentially distributed. We note that the validity of assuming exponentially distributed burst lengths is supported by OBS performance studies in [6] and [7].
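The scheduler behaviour described above can be sketched as a simple search over successively longer FDLs. The data structures below are illustrative (they are not from the paper), and the availability test is simplified to checking the channel state at the burst's exit time.

```python
# Illustrative sketch of the FDL scheduling loop (full conversion case).
# free_at[ch] holds the time output channel ch next becomes idle.

def schedule(arrival, duration, free_at, fdl_has_free_channel, C, K):
    """Return (delay, channel) for a burst, or None if it is blocked."""
    for k in range(0, K + 1):            # k = 0 means no delay line is used
        exit_time = arrival + k * C      # burst re-emerges after delay k*C
        if k > 0 and not fdl_has_free_channel(k, arrival, duration):
            continue                     # all R channels of FDL k are busy
        for ch, t_free in enumerate(free_at):
            if t_free <= exit_time:      # channel idle when the burst exits
                return k * C, ch
    return None                          # blocked (lost)
```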
III. MODEL ANALYSIS

We first note that the traffic overflowing from the ports, subsequently offered to the bank of FDLs, is not Poisson. Thus, each FDL could be described by a GI/M/R/R system, where GI denotes independent arrivals with a general distribution and R is the number of channels an FDL may carry. We also note that, as an alternative to an individual GI/M/R/R model for each FDL, we may model the combined chain of K FDLs as a single GI/M/L/L system, where L = K·R is the total number of FDL channels in the bank, because the overflows are renewal streams under the current assumptions. This GI/M/L/L model directly relates the traffic offered to the FDLs, P·F̂, to the carried and overflow traffics, P·F̄ and F̂_B respectively. We treat the case of no wavelength conversion first (where R = 1 and L = K) and then extend to the case of full wavelength conversion.

A. No Wavelength Conversion

In this case, each output channel is modelled as an independent M/M/1/1 system, and so an output port corresponds to a set of N independent M/M/1/1 queues. The effective offered input traffic to a port, F with mean intensity M, is split into N input subflows with the same mean intensity m = M/N, each being offered to one of the N independent M/M/1/1 queues. Therefore, the probability of blocking in the output port is equivalent to the probability of having one output channel busy when the mean intensity of the offered traffic is m. This probability is given by the well-known Erlang B formula, of the form

E(A, N) = \frac{A^N / N!}{\sum_{k=0}^{N} A^k / k!},    (1)

where A is the arrival intensity of the offered traffic in units of Erlangs and N is the group size (number of output channels) [10]. The arrival intensity A is given as the product of the mean arrival rate λ and the mean channel holding time 1/µ, where µ is the mean burst transmission rate. In our case, for one output channel, the blocking probability given by (1) is simply E(m, 1) = m/(1 + m). We now wish to calculate the mean (m̂) and variance (v̂) of the overflow traffic from one of the N independent M/M/1/1 systems.
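Equation (1) is usually evaluated with the standard Erlang-B recurrence rather than with factorials, which overflow for large N. A minimal sketch:

```python
# Numerically stable evaluation of the Erlang-B formula (1) via the
# recurrence E(A, n) = A*E(A, n-1) / (n + A*E(A, n-1)), with E(A, 0) = 1.

def erlang_b(A: float, N: int) -> float:
    E = 1.0
    for n in range(1, N + 1):
        E = A * E / (n + A * E)
    return E

# Single-channel check: E(m, 1) = m / (1 + m), as used for the M/M/1/1 ports
assert abs(erlang_b(0.5, 1) - 0.5 / 1.5) < 1e-12
```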


We now wish to calculate the mean (m̂) and variance (v̂) of the overflow traffic from one of the N independent M/M/1/1 systems.

Fig. 3. Overflow System and its Equivalent for N Output Channels and L Total Channels in FDL Bank: a virtual Poisson traffic source with interarrival density f(t) = A* e^{−A*t}, offered to an equivalent primary group of N* channels, reproduces the peaked overflow traffic (P·M̂, P·V̂) offered to the FDL bank; the bank overflow has moments (M̂_B, V̂_B).

According to the Kosten overflow equations [8], we have

\hat{m} = m \cdot E(m, 1)   (2)

\hat{v} = \hat{m}\left(1 - \hat{m} + \frac{m}{2 + \hat{m} - m}\right).   (3)

Therefore, the mean of the overflow traffic from an output port (denoted M̂) will be given by M̂ = N · m̂ and, since we are assuming independence between the output channels, the overflow variance (denoted V̂) is given by V̂ = N · v̂. Again, assuming independence between output ports, the total overflow traffic from all ports is described by P · M̂, P · V̂. This traffic is offered traffic to the FDL bank.

We note that the overflow traffic is "peaked" (the variance V̂ of the traffic intensity is greater than the mean M̂) and not Poisson. To evaluate the overflow mean from the bank of FDLs we employ Equivalent Random Theory [8]. In particular, we introduce a virtual equivalent system with a primary group of N* channels being offered Poisson traffic with mean intensity A* (Fig. 3). We then match the overflow of this system with the offered traffic to the FDL bank. This means that the mean A* of the equivalent Poisson offered traffic and the equivalent number of channels N* must both satisfy the Kosten overflow system of equations,

P \cdot \hat{M} = A^* \cdot E(A^*, N^*)   (4)

P \cdot \hat{V} = P \cdot \hat{M}\left(1 - P \cdot \hat{M} + \frac{A^*}{N^* + 1 - A^* + P \cdot \hat{M}}\right),   (5)

which may be solved for A* as a numerical root-finding problem. We may choose an initial solution for the numerical solution from Rapp's approximation [9] for an overflow system:

A^* \approx P \cdot \hat{V} + 3\hat{Z}(\hat{Z} - 1)

N^* \approx \frac{A^*(P \cdot \hat{M} + \hat{Z})}{P \cdot \hat{M} + \hat{Z} - 1} - P \cdot \hat{M} - 1

where Ẑ is the "peakedness" factor P · V̂ / (P · M̂).

In this way, we can calculate the mean (M̂_B) and the variance (V̂_B) of the overflow traffic from the FDL bank employing the Brockmeyer overflow system of equations,

\hat{M}_B = A^* \cdot E(A^*, N^* + L)   (6)

\hat{V}_B = \hat{M}_B\left(1 - \hat{M}_B + \frac{A^*}{N^* + L + 1 - A^* + \hat{M}_B}\right).   (7)

Consequently, the mean of the FDL carried traffic is given by

P \cdot \bar{M} = P \cdot \hat{M} - \hat{M}_B.   (8)

Now, considering the feedback connection at the input, the aggregation of the actual input flow F_I at a single port, of mean M_I, and the carried traffic flow F̄ gives

M = M_I + \bar{M},   (9)

where M, M_I and M̄ are means of flows F, F_I and F̄ respectively. We solve equation (9) numerically using a simple bisection method with initial bounds determined as follows. The lower bound on M is taken as M_I, as we are assured that the feedback flow M̄ makes M > M_I. The upper bound on M is taken as 2M_I, as the feedback flow intensity cannot be greater than M_I. Finally, having computed the mean of the FDL overflow intensity, we calculate the node blocking probability as

B = \hat{M}_B / (P \cdot M_I).   (10)

B. Full Wavelength Conversion

We now assume that the optical switch of Fig. 1 has full wavelength conversion capability. In this case, a burst will be blocked if and only if all output channels are busy, therefore the output port can be modeled as an M/M/N/N queue. The analysis follows the above approach but with the overflow from one output port now given by

\hat{M} = M \cdot E(M, N)   (11)

\hat{V} = \hat{M}\left(1 - \hat{M} + \frac{M}{N + 1 + \hat{M} - M}\right).   (12)

The remainder of the analysis is the same. Given M̂ and V̂, the overflow from the FDL bank is calculated by (4), (5), (6) and (7). Finally equation (9) is iterated to the solution and the blocking probability calculated with Equation (10).
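The whole fixed-point computation of this section is compact enough to sketch in Python. The outline below is our illustration of the analysis, not code from the paper: it matches the port overflow to an equivalent random system, but for simplicity it uses Rapp's approximation directly as the (A*, N*) estimate instead of the exact root-finding the text describes, and it evaluates Erlang B at non-integer channel numbers through the upper incomplete gamma function. All function names are ours; scipy is assumed available, and the overflow traffic is assumed peaked (Ẑ > 1).

```python
import numpy as np
from scipy.special import gammaincc, gammaln

def erlang_b_cont(A, x):
    """Erlang B extended to non-integer channel numbers x via the upper
    incomplete gamma function: E(A, x) = A^x e^{-A} / Gamma(x+1, A)."""
    log_num = x * np.log(A) - A
    log_den = np.log(gammaincc(x + 1, A)) + gammaln(x + 1)
    return np.exp(log_num - log_den)

def kosten_moments(A, N):
    """Mean/variance of the traffic overflowing N channels offered A Erlangs."""
    m = A * erlang_b_cont(A, N)
    v = m * (1 - m + A / (N + 1 - A + m))
    return m, v

def fdl_bank_overflow(Mo, Vo, L):
    """Mean overflow from the shared FDL bank for peaked offered traffic
    (Mo, Vo), using Rapp's approximation for the equivalent random system
    (A*, N*) and the Brockmeyer equation (6)."""
    Z = Vo / Mo                                        # peakedness, Z > 1
    A_star = Vo + 3 * Z * (Z - 1)                      # Rapp
    N_star = A_star * (Mo + Z) / (Mo + Z - 1) - Mo - 1
    return A_star * erlang_b_cont(A_star, N_star + L)  # eq. (6)

def node_blocking(M_I, P, N, L, full_conversion=True, iters=60):
    """Bisection on the effective offered intensity M, eq. (9), then eq. (10)."""
    def port_overflow(M):
        if full_conversion:
            return kosten_moments(M, N)                # eqs. (11)-(12)
        m1, v1 = kosten_moments(M / N, 1)              # eqs. (2)-(3)
        return N * m1, N * v1
    def residual(M):
        m_hat, v_hat = port_overflow(M)
        M_B = fdl_bank_overflow(P * m_hat, P * v_hat, L)
        M_bar = (P * m_hat - M_B) / P                  # eq. (8), per port
        return M_I + M_bar - M                         # eq. (9)
    lo, hi = M_I, 2 * M_I                              # bounds from the text
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if residual(mid) > 0 else (lo, mid)
    M = 0.5 * (lo + hi)
    m_hat, v_hat = port_overflow(M)
    M_B = fdl_bank_overflow(P * m_hat, P * v_hat, L)
    return M_B / (P * M_I)                             # eq. (10)
```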


Fig. 4. Blocking Probability vs. Offered Traffic Intensity per Wavelength Channel - 10 Channels per Output Port - 1, 3, 5 & 10 FDLs, each with one Channel - No Wavelength Conversion (simulation and analysis, R = 1, 3, 5, 10).

Fig. 5. Blocking Probability vs. Offered Traffic Intensity per Wavelength Channel - 10 Channels per Output Port - 1, 3, 5 & 10 Channels on a single FDL - Full Wavelength Conversion (simulation and analysis, R = 1, 3, 5, 10).

IV. RESULTS AND ANALYSIS

We compare analytic results for blocking B with results from a discrete-event simulation of an OBS node implemented in Opnet Modeler™. We consider a node with four ports and three different node configurations: (i) a node with 10 output channels and no wavelength conversion (Fig. 4), (ii) a node with 10 output channels and full conversion (Fig. 5) and (iii) a node with 40 output channels and full conversion (Fig. 6). For case (i) we compare results for different numbers of single-channel FDLs. For cases (ii) and (iii) we compare results for different numbers of channels in a single FDL.

We can observe that the values of blocking probability calculated through analysis for a small number of total FDL channels are in good agreement with the results obtained from simulation. For larger numbers of FDL channels the results provide an approximate lower bound for the full wavelength conversion case. The inaccuracy can be, at least partly, attributed to the fact that as the number of FDL channels increases, the proportion of the non-Poisson traffic offered to the output channels (which we have assumed Poisson) will increase. Additionally, this traffic tends to be peaked, as it is mainly characterised by the overflow traffic from the output port, and peaked traffic will result in a higher actual blocking probability. For the case of no wavelength conversion and high offered load, the opposite occurs, with the blocking probability being overestimated. In this case, the carried traffic from the FDLs is in reality smoother than Poisson, because the single-channel FDLs are heavily loaded enough to smooth the peaked offered traffic from the output ports.

V. CONCLUSIONS

In this paper we have proposed an approximate analysis for calculating the blocking probability in an OBS switch equipped with FDLs and wavelength converters, using Equivalent Random Theory and assuming Poisson input traffic. The accuracy of the proposed analysis may be further improved if the input traffic were modelled by two moments, which is possible with ERT. This will be a topic for future work.

Fig. 6. Blocking Probability vs. Offered Traffic Intensity per Wavelength Channel - 40 Channels per Output Port - 4, 8 & 12 Channels on a single FDL - Full Wavelength Conversion (simulation and analysis, R = 4, 8, 12).

REFERENCES

[1] C. Qiao, M. Yoo, "Optical burst switching (OBS) - a new paradigm for an optical Internet," Journal of High Speed Networks, vol. 8, no. 1, pp. 69-84, January 1999.
[2] F. Callegati, "Optical Buffers for Variable Length Packets," IEEE Communications Letters, vol. 4, no. 9, September 2000.
[3] W. Rogiest, D. Fiems, K. Laevens, H. Bruneel, "Exact Performance Analysis of FDL Buffers with Correlated Arrivals," IFIP International Conference on Wireless and Optical Communications Networks, 2007.
[4] P. Reviriego, A. M. Guidotti, C. Raffaelli, J. Aracil, "Blocking models of optical burst switches with shared wavelength converters: exact formulations and analytical approximations," Photonic Network Communications, vol. 16, issue 1, 2008.
[5] Z. Rosberg, A. Zalesky, H. L. Vu, M. Zukerman, "Analysis of OBS Networks With Limited Wavelength Conversion," IEEE/ACM Transactions on Networking, vol. 14, no. 5, pp. 1118-1127, October 2006.
[6] C. M. Gauger, H. Buchta, E. Patzak, "Integrated Evaluation of Performance and Technology - Throughput of Optical Burst Switching Nodes Under Dynamic Traffic," Journal of Lightwave Technology, vol. 26, no. 13, pp. 1969-1979, July 2008.
[7] A. Rostami, A. Wolisz, "Modeling and Synthesis of Traffic in Optical Burst-Switched Networks," Journal of Lightwave Technology, vol. 25, no. 10, pp. 2942-2952, October 2007.
[8] A. Girard, Routing and Dimensioning in Circuit-Switched Networks, Addison-Wesley Longman Publishing Company, 1990.
[9] ITU-D, Study Group 2, Teletraffic Engineering Handbook, URL: http://www.itu.int/ITU-D, last visited May 2009.
[10] L. Kleinrock, Queueing Systems Volume I: Theory, John Wiley & Sons Inc, 1975.


Section 4A
ANTENNAS AND CIRCUITS 2


Design of Integrated Stacked Spiral Thin-Film Transformer Based on Silicon Substrate

Liang Zheng 1,2, Huibin Qin 2, Stephen Daniels 1
1. National Centre for Plasma Science and Technology, Dublin City University, Dublin, Ireland
2. Institute of Electron Device & Application, Hangzhou Dianzi University, Hangzhou, China
E-mail: zhlbsbx@hotmail.com

Abstract

In this paper, an integrated stacked spiral thin-film transformer is presented. A lumped model is used to simulate the characteristics of the integrated transformer. The structure and manufacturing process of the transformer, based on IC technology, are shown. S-parameters are measured at 10 MHz-20 GHz. The results show that the transmission efficiency of the air-core thin-film transformer is more than 50% at 1 GHz-20 GHz, with a maximum of 89% at 20 GHz.

1. Introduction

In the past several decades, owing to the rapid development of electronic technology, high-speed electronic devices have come into widespread use in radio frequency circuits and other systems, such as Bluetooth, GSM mobile, WLAN, UWB, etc. At the same time, people need smaller electronic aids for more convenient living, which means that modern electron devices should be miniaturized, integrated and operated at higher frequencies. The transistor, resistor, capacitor and inductor at the nH level have now been integrated on chip successfully. However, the integration technology of the transformer has not been well studied, which limits the development of electronic systems.

As an indispensable passive component, the transformer has intrinsic advantages relative to active devices in signal insulation and transmission, and in signal synthesis and conversion. The size of the transformer is being reduced; miniature transformers such as the planar transformer and the thin-film transformer were presented in the 90's of the last century. In 1993, K. Yamaguchi of Japan [1] presented a spiral coil type thin film microtransformer with a size of 2.4*3.1 mm², fabricated using photolithography techniques. An amorphous magnetic film of multilayered CoNbZr/SiO2 was used as the magnetic core. In 2004, Terence O'Donnell of Ireland [2] designed a transformer having 80 μm thick copper conductors and a two-layer laminated magnetic core. The transformer had an efficiency of 78% at an output power of 3.5 W in 5 MHz-10 MHz. In addition, other technologies have been used to fabricate thin film transformers [3, 4].

However, these miniature transformers have disadvantages: the chip areas of transformers fabricated by multilayer printing technology are still too large compared with the size of an RFIC chip, the fabrication of transformers using MEMS technology is complex, etc. The greatest difficulty is that the transmission efficiency is not high enough, in particular at high frequency. In this paper, a stacked spiral thin-film transformer working at high frequency, based on IC technology, is designed and prepared.

2. Equivalent circuit model of Thin-film Transformer

There are a number of parasitic effects in a thin-film transformer at high frequency, such as the capacitive coupling between the primary inductance spiral and the secondary inductance spiral, the eddy loss of the substrate, and so on. A stacked transformer, which has two overlapped spirals with the same shape, can provide high self-inductance and high coupling because of its vertical and lateral magnetic coupling. A simple lumped circuit model can simulate the characteristics of the transformer, and describe the physical significance of the parasitic effects intuitively. The equivalent circuit model [5] is shown in figure 1.


Figure 1. Simple lumped equivalent circuit model of the thin-film transformer: (a) primary and secondary spirals with coupling elements C_o, C_b and C_t; (b) two-port equivalent with series R and L+M branches, mutual element −M, and shunt capacitances C_b, C_t and C_m.

In this equivalent circuit model, L models the self-inductance of the primary spiral and secondary spiral, M is the mutual coupling coefficient, C_o is the coupling capacitance between the two spirals, C_b is the coupling capacitance between the primary spiral and the substrate, and C_t is the coupling capacitance between the secondary spiral and the substrate. The computing formulas are shown below:

C_b = \frac{\varepsilon l w}{2 t_b}   (1)

C_t = \frac{\varepsilon l w}{2 t_t} \cdot \frac{A - A_{ov}}{A}   (2)

C_m = C_t + C_b   (3)

C_o = \frac{\varepsilon l w}{t_{t-b}} \cdot \frac{A_{ov}}{A}   (4)

R = \frac{\rho l}{\delta w (1 - e^{-t/\delta})}   (5)

M = kL   (6)

L = \frac{\mu n^2 d_{avg}}{2}\left[\ln(2.46/\rho) + 0.2\rho^2\right]   (7)

In these formulas, l is the length of the spiral, μ is the permeability, ε is the dielectric constant, ρ is the metal resistivity, w is the width of the spiral, δ is the skin depth, k is the mutual coupling coefficient (k of a stacked transformer is about 0.7-0.9), R is the series resistance, A is the chip area of the inductor spiral, and A_ov is the overlapped area of the two spirals. In (7), n is the number of turns, d_avg is the average diameter of the spiral, and the ρ appearing there is the spiral fill ratio of the current-sheet inductance expression [5], distinct from the metal resistivity in (5).

From the equivalent circuit model, it is easy to see that reducing the parasitic capacitance caused by the spirals and the substrate at high frequency is the key point for improving the transmission efficiency of the transformer.
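As an illustration of how formulas (1)-(7) translate into element values, the following Python sketch evaluates the lumped model for a hypothetical geometry. All parameter names and any numeric defaults are our assumptions, not the dimensions of the fabricated device, and rho_fill is the fill ratio used in (7):

```python
import math

EPS_SIO2 = 3.9 * 8.854e-12   # SiO2 permittivity [F/m] (assumed dielectric)
MU0 = 4 * math.pi * 1e-7     # vacuum permeability [H/m]

def transformer_elements(l, w, t, t_b, t_t, t_tb, A, A_ov,
                         rho, f, n, d_avg, rho_fill, k=0.8):
    """Lumped elements of the stacked transformer, eqs. (1)-(7).
    l, w, t: spiral length, width, metal thickness [m]; t_b, t_t, t_tb:
    oxide thicknesses below the primary, below the secondary, and between
    the spirals [m]; A, A_ov: spiral area and overlapped area [m^2];
    rho: metal resistivity [ohm*m]; f: frequency [Hz]; n: turns;
    d_avg: average spiral diameter [m]; rho_fill: fill ratio;
    k: coupling coefficient (about 0.7-0.9 for a stacked transformer)."""
    delta = math.sqrt(rho / (math.pi * f * MU0))            # skin depth
    C_b = EPS_SIO2 * l * w / (2 * t_b)                      # eq. (1)
    C_t = EPS_SIO2 * l * w / (2 * t_t) * (A - A_ov) / A     # eq. (2)
    C_m = C_t + C_b                                         # eq. (3)
    C_o = EPS_SIO2 * l * w / t_tb * (A_ov / A)              # eq. (4)
    R = rho * l / (delta * w * (1 - math.exp(-t / delta)))  # eq. (5)
    L = MU0 * n**2 * d_avg / 2 * (math.log(2.46 / rho_fill)
                                  + 0.2 * rho_fill**2)      # eq. (7)
    M = k * L                                               # eq. (6)
    return dict(C_b=C_b, C_t=C_t, C_m=C_m, C_o=C_o, R=R, L=L, M=M)
```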


3. Structure and Fabrication sequence of the Thin-film Transformer

There is good experience in reducing the parasitic capacitance from the preparation technology of integrated inductors [6, 7]. A high-resistance substrate and a thick substrate are adopted to reduce the parasitic capacitance due to substrate loss of the transformer. In this paper, an air-core stacked thin-film transformer is designed. The primary coil and secondary coil are overlapped completely. Overlapping coils have high magnetic coupling, which can transmit energy more effectively; the structure is shown in fig. 2.

Fig. 2. Structure of the thin-film transformer: (a) cross-section view through the SiO2/Si3N4 layer stack, (b) top view.

The whole structure consists of 10 layers on top of the silicon substrate.
1) Layer 1 is SiO2. It supports the whole device and isolates the Si substrate. Thermal oxidation is used to fabricate the SiO2. The thickness of the SiO2 layer is 1 μm; a thick oxide can reduce the parasitic capacitance.
2) Layer 2 is Si3N4, fabricated by PECVD. The thickness of the Si3N4 layer is 120 nm. The electrical resistivity of Si3N4 is 10^15 Ω·cm; a substrate with high electrical resistivity can reduce substrate loss. Furthermore, Si3N4 has low fluidity and is difficult to corrode; these characteristics improve the compatibility of the subsequent etching process.
3) Layer 3 is Metal1. Metal1 is the lead wire, made of aluminium. DC magnetron sputtering is used to deposit the Al. Two lead terminals of the primary spiral are prepared. The thickness of the Al layer is 0.7 μm.
4) Layer 4 is SiO2. SiO2 with 1 μm thickness isolates the lead wire from the primary spiral and is fabricated by PECVD.
5) Layer 5 is the primary spiral. DC magnetron sputtering is used to deposit Al 1 μm thick.
6) Layer 6 is SiO2, prepared by PECVD. The thickness is 1 μm.
7) Layer 7 is the secondary spiral. DC magnetron sputtering is used to deposit Al 1 μm thick.
8) Layer 8 is the insulation layer. PECVD is used to fabricate a SiO2 layer with 1.2 μm thickness.
9) Layer 9, Metal4, is also a lead wire made of aluminium. DC magnetron sputtering is used to deposit the Al. Two lead terminals of the secondary spiral are prepared. The thickness of the Al layer is 1.2 μm.
10) Layer 10 is the passivation layer. PECVD is used to fabricate a SiO2 layer with 1.2 μm thickness.

The fabrication sequence is shown in figure 3.

Figure 3. Fabrication sequence of the thin-film transformer (alternating heat treatment/oxidation, PECVD, photoetching and DC magnetron sputtering steps for each metal layer).

4. Measurement results

A set of the thin-film transformers was measured. The turn numbers of the two spirals are both 5, the spiral line width is 12 μm, the spiral line spacing is 3 μm, the thickness of the Al layer is 1 μm, and the spiral area is 0.5*0.5 mm². The transmission characteristics were measured at 10 MHz-20 GHz by an Agilent PNA E8363B vector network analyser and a Cascade Microtech ACP GSG probe. The measured results are shown in Fig. 4.

Figure 4. The 5:5 transformer's S(2,1) curve.

5. Conclusions

A thin-film transformer based on Si IC technology was presented. A high-resistance substrate and a thick substrate are adopted to reduce substrate loss. Test results show that the transformer can obtain a maximal transmission efficiency of 89% at 20 GHz. The thin-film transformer has advantages such as small size, operation at high frequency, suitability for mass production, and so on. It can hopefully be used in RFICs.

6. Acknowledgement

This paper is supported by the National Nature Science Foundation of China (No. 60601022, No. 60671024) and the Natural Science Foundation of Zhejiang Province of China (No. Y107205 and No. Y407133).

7. References

[1] Yamaguchi K, Sugawara E, Nakajima O. "Load characteristics of a spiral coil type thin film microtransformer". IEEE Transactions on Magnetics, vol. 29, pp. 3207-3209. Nov. 1993.
[2] Terence O'Donnell, Ningning Wang, Magali Brunet, Saibal Roy, et al. "Thin film micro-transformers for future power conversion". APEC '04, Nineteenth Annual IEEE, vol. 2, pp. 939-944. 2004.
[3] Hiroaki Tsujimoto, Ieyasu O. "High frequency transmission characteristic of co-planar film transformer fabricated on flexible polyamide film". IEEE Transactions on Magnetics, vol. 31, pp. 4232-4234. Nov. 1995.


[4] A. H. Miklich, J. X. Przybysz, T. J. Smith. "Superconducting thin-film transformers at microwave frequencies". IEEE Transactions on Applied Superconductivity, vol. 9, pp. 3062-3065. Jun. 1999.
[5] Mohan S. S. The design, modeling and optimization of on-chip inductor and transformer circuits [D]. Stanford University, 1999: 128-129.
[6] Kirk B. Ashby, Ico A. Koullias, William C., et al. "High Q inductors for wireless applications in a complementary silicon bipolar process". IEEE Journal of Solid-State Circuits, vol. 31, pp. 4-9. Jan. 1996.
[7] H. B. Erzgraber, Th. Grabolla, H. H. Richter, et al. "A novel buried oxide isolation for monolithic RF inductors on silicon". IEDM '98 Technical Digest: 535-539. Dec. 1998.


Equivalent Circuit Modeling for On-Wafer Interconnects on SiO2-Si Substrate

Jun Liu 1,2, Lingling Sun 1, Huang Wang 1, Liheng Lou 1, Charles McCorkell 2
1 Key Laboratory of RF Circuits and Systems, Ministry of Education, Hangzhou Dianzi University, Hangzhou, China
2 School of Electronic Engineering, Dublin City University, Dublin 9, Ireland
ljun77@163.com

Abstract

A new equivalent circuit for on-wafer interconnect modeling is presented. The skin effect, fringing effect and substrate losses have been considered. An additional element is introduced to conveniently predict the substrate capacitive coupling effect. Besides, an enhanced de-embedding method is introduced to remove the parasitics of the test structures, and the de-embedded measured data are used to analytically extract the associated model parameters. The accuracy was demonstrated by on-wafer Y-parameter measurements of interconnects up to 40 GHz, fabricated on the top-metal layer employing SMIC 0.18 µm RF-CMOS technology.

1. Introduction

Due to the combination of increased circuit complexity and higher frequencies, modern integrated circuit (IC) performance becomes more and more subject to interconnect behaviour [1-3]. Accurate modeling and analysis of the interconnect structure are essential to the realization of the next generation of high performance ICs. Many interconnect modeling methods, such as the electromagnetic (EM) based modeling approach [4-6] and the measurement based modeling approach (i.e. the lumped equivalent circuit modeling method) [7-9], have been developed in the literature. Compared with the EM-based modeling approach, the measurement based modeling approach has more silicon-verified accuracy and more computational efficiency [10]. When a shorter subsection model is well established, an equivalent circuit model of longer wires can be conveniently obtained by cascading shorter subsections together.

Several equivalent models of interconnects have been reported [7-9]. Some of them are so simple that many effects, such as the fringing effect, the skin effect and substrate loss, are not described, so these models cannot be used in the RF/mm-wave region [7, 8]. The extraction of the associated parameters of some reported models [9] has to be performed using optimization and fitting procedures, which may result in nonphysical extracted values. Besides, since the accuracy of measurements can seriously affect the accuracy of extracted model parameters, traditional de-embedding methods such as OSD (open-short de-embedding), POSD (pad-open-short de-embedding), OTD (open-through de-embedding) and OSTD (open-short-through de-embedding) cannot remove all the test structure parasitics, especially the pad-stub-line discontinuities.

In this paper, a more accurate equivalent circuit model for interconnects is presented, with an analytical parameter extraction method. A novel component is introduced to characterize the substrate capacitive coupling effect along the line at high frequency. Besides, an enhanced de-embedding method based on transmission line theory [11] is developed to remove the parasitics of the pads and other test structures. In order to substantiate the proposed modeling method, two test structures are designed and fabricated using a 0.18 µm RF CMOS process by Semiconductor Manufacturing International Corporation (SMIC) and characterized up to 40 GHz.

2. De-embedding Method Description

By simply modeling the left and right pads solely as lumped admittances YL and YR (YR = YL, by symmetry), and lumping the pad-line discontinuities together with the pads, Alain M. Mangan et al.
developed a novel technique to de-embed the contributions of parasitic structures from interconnect/transmission line measurements,


based on the ABCD transmission matrix theory [11]. In actual application instances, when the length of the measured interconnect line is much longer than the size of the pad structure, [11] can be accurately implemented up to high frequency. But when a line's length becomes comparable to the size of the pads, solely modeling the left or right pad structure as a lumped Y-admittance will introduce error into the interconnect line parameter extraction, because of the existence of parasitics between the two pads.

In fact, considering a real-world test structure of an interconnect line, Fig. 1 reveals that the transmission matrix of either test structure can be decomposed into a cascade of 6 two-port networks (not the 5 two-port networks introduced in [11]), consisting of the two pads (modeled by YL and YR), the parasitic between the two pads (modeled by YF), the intrinsic device, and the associated pad-line discontinuities. Assuming that the pad-line junctions can be modeled by lumped admittances, and that the parasitics introduced by the two pads can be removed with a pad-open structure using traditional de-embedding theory, a more accurate de-embedding method similar to [11] can be derived as follows.

Fig. 1. (a) Composition of an interconnect line test structure and (b) a possible equivalent representation for the two pads.

Consider two interconnect line test structures of lengths l_1 and l_2 (l_1 > l_2), each with a pad-open structure (Fig. 2). First implement a de-open procedure to remove the pad parasitics (e.g. YL, YR and YF in Fig. 1(b)) from the measurements, then transform the de-embedded measurements to ABCD matrices:

Y_{measured\_l_i} = stoy[S_{measured\_l_i}] - stoy[S_{measured\_open_i}]   (1)

M_{measured\_l_i} = ytoabcd[Y_{measured\_l_i}], \quad (i = 1, 2)   (2)

The ABCD transmission matrix of test structure l_i, M_{measured\_l_i}, can be represented as the following product:

M_{measured\_l_i} = M_{D1} \times M_{l_i} \times M_{D2}   (3)

where M_{l_i} is the ABCD matrix of the intrinsic line segment of the structure, M_{D1} is the ABCD matrix of the left pad-line junction, and M_{D2} is the ABCD matrix of the right pad-line junction.

Consider M_{l_1} multiplied by the inverse of M_{l_2}; define the hybrid "structure" M_{measured\_l_1-l_2} = M_{measured\_l_1} \times Inverse[M_{measured\_l_2}] and M_{l_1-l_2} as a line segment of length l_1 - l_2:

M_{measured\_l_1-l_2} = M_{measured\_l_1} \times Inverse[M_{measured\_l_2}] = M_{D1} \times M_{l_1-l_2} \times Inverse[M_{D1}]   (4)

Assuming that the left pad-line discontinuity can be modeled solely by a lumped admittance Y_D, then:

M_{D1} = \begin{bmatrix} 1 & 0 \\ Y_D & 1 \end{bmatrix}   (5)

M_{measured\_l_1-l_2} = \begin{bmatrix} 1 & 0 \\ Y_D & 1 \end{bmatrix} \times M_{l_1-l_2} \times \begin{bmatrix} 1 & 0 \\ -Y_D & 1 \end{bmatrix}   (6)

Under the lumped pad assumption, the hybrid "structure" can be expressed, in terms of Y-parameters, as a parallel combination of the intrinsic transmission line and the parasitic lumped pads:

Y_{measured\_l_1-l_2} = Y_{l_1-l_2} + \begin{bmatrix} Y_D & 0 \\ 0 & -Y_D \end{bmatrix}   (7)

where Y_{measured\_l_1-l_2} is the Y-parameter representation of M_{measured\_l_1-l_2} and Y_{l_1-l_2} is the Y-parameter representation of M_{l_1-l_2}.

Since the intrinsic device is symmetric, its Y-parameters can be isolated by connecting Y_{measured\_l_1-l_2} in parallel with a port-swapped version of itself, thus canceling out the effects of the pads:

Y_{l_1-l_2} = \frac{Y_{measured\_l_1-l_2} + Swap[Y_{measured\_l_1-l_2}]}{2}   (8)

where

Swap\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} a_{22} & a_{21} \\ a_{12} & a_{11} \end{bmatrix}   (9)
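A compact numerical rendering of equations (2)-(9) is sketched below. It is our illustration, not code from the paper: the function and variable names are assumptions, it starts from Y-matrices already converted from the measured S-parameters (i.e. after the stoy step of (1)), and it uses the standard two-port Y-to-ABCD identities:

```python
import numpy as np

def y_to_abcd(Y):
    """Convert a 2x2 two-port Y-matrix to its ABCD matrix (standard identity)."""
    y11, y12, y21, y22 = Y[0, 0], Y[0, 1], Y[1, 0], Y[1, 1]
    return np.array([[-y22 / y21, -1.0 / y21],
                     [(y12 * y21 - y11 * y22) / y21, -y11 / y21]])

def abcd_to_y(M):
    """Convert a 2x2 ABCD matrix back to Y-parameters (standard identity)."""
    A, B, C, D = M[0, 0], M[0, 1], M[1, 0], M[1, 1]
    return np.array([[D / B, (B * C - A * D) / B],
                     [-1.0 / B, A / B]])

def de_embed_line(Y_l1, Y_open1, Y_l2, Y_open2):
    """Intrinsic Y-parameters of the l1 - l2 line segment, eqs. (1)-(9).
    Inputs are measured Y-matrices of the two line test structures and
    their pad-open structures at one frequency point."""
    # Eq. (1): de-open, then eq. (2): convert to ABCD matrices.
    M1 = y_to_abcd(Y_l1 - Y_open1)
    M2 = y_to_abcd(Y_l2 - Y_open2)
    # Eq. (4): hybrid "structure" of length l1 - l2.
    Y_hybrid = abcd_to_y(M1 @ np.linalg.inv(M2))
    # Eqs. (8)-(9): parallel with the port-swapped version cancels the pads.
    swap = Y_hybrid[::-1, ::-1]   # a11 <-> a22, a12 <-> a21
    return 0.5 * (Y_hybrid + swap)
```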


3. Interconnect Modeling

A. Equivalent Circuit Model

With increasing frequency, the frequency dependent effects of the interconnect line, such as the skin effect, the substrate effect and the distributed effect, begin to impact the performance. In other words, the characteristics of interconnects in the RF/mm-wave region are frequency variant.

Unfortunately, the current commercial circuit simulators (such as H/P-SPICE, Agilent Advanced Design System, Cadence) do not support simulation with frequency-variant parameters. Moreover, with frequency-independent components the distributed nature of the line is easily accommodated, so that a compact model can operate over an extended frequency range (i.e., into the mm-wave region) [12]. Thus, employing an equivalent circuit model to characterize these effects with frequency-independent components becomes a technology of increasing importance.

Fig. 2. (a) Two interconnect line test structures with pad-open structures respectively and (b) multiplying the transmission matrix of line 1 with the matrix inverse of line 2 after removing the pad-open parasitics.

Fig. 3. Equivalent circuit model.

Because the length (l_1 - l_2) of the interconnect line researched here is less than 174 µm, according to [13] the one-Π model should be capable of precisely characterizing the high frequency behaviour of interconnects. Fig. 3 shows the equivalent circuit model proposed for interconnect lines fabricated in silicon technology. A novel component, C_couple, is brought in to predict the substrate capacitive coupling effect and the power storage in the substrate below the interconnect line at high frequency. The parallel components R_sk and L_sk are introduced to characterize the skin effect, with reference to [9]. The series components C_sub2 and R_sub, combined with C_sub1, are used to characterize the parasitics of the substrate, such as the loss effect, the coupling effect and others.

B. Calculation of the Model Parameters

Assume that the parasitics of the pads and the pad-line junctions have been removed using the de-embedding method proposed in section 2. Then the equivalent circuit illustrated by Fig. 3 can be characterized by three admittances, which can be determined from:

Y_1 = Y_{l_1-l_2,11} + Y_{l_1-l_2,12}   (10)

Y_2 = Y_{l_1-l_2,22} + Y_{l_1-l_2,21}   (11)

Y_3 = -Y_{l_1-l_2,12}   (12)

where Y_{l_1-l_2} is the Y-parameter matrix of the intrinsic line l_1 - l_2 mentioned in section 2.

Furthermore, from the Y-parameters of the equivalent circuit in Fig. 3, the following equations can be derived:

Y_1 = Y_2 = Real(Y_1) + j \cdot Imag(Y_1)   (13)

where

Real(Y_1) = \frac{R_{sub}}{R_{sub}^2 + \left(\frac{1}{\omega C_{sub2}}\right)^2}   (14)

Imag(Y_1) = \omega C_{sub1} + \frac{\frac{1}{\omega C_{sub2}}}{R_{sub}^2 + \left(\frac{1}{\omega C_{sub2}}\right)^2}   (15)

As seen from Fig. 3, the following equation can also be derived:

Y_3 = \left[-Y_{l_1-l_2,12}\right] = j\omega C_{couple} + \left[Real(Z_3) + j \cdot Imag(Z_3)\right]^{-1}   (16)

where

Re(Z_3) = R_s + \frac{\omega^2 L_{sk}^2 R_{sk}}{R_{sk}^2 + \omega^2 L_{sk}^2}   (17)

Im(Z_3) = \omega L_s + \frac{\omega L_{sk} R_{sk}^2}{R_{sk}^2 + \omega^2 L_{sk}^2}   (18)

At low frequency, the skin effect is not significant; below a given frequency (10 GHz, in our work), the parallel components R_sk and L_sk in Fig. 3 can be neglected, and thus the real and imaginary parts of Y3 can be derived as follows:


Re(Y_{3\_LowFreq}) = \frac{R_s}{R_s^2 + \omega^2 L_s^2}   (19)

Im(Y_{3\_LowFreq}) = \omega C_{couple} - \frac{\omega L_s}{R_s^2 + \omega^2 L_s^2}   (20)

The above expressions for Y1, Y2 and Y3/Z3 contain all the information required for extracting all the parameters of the model. Taking the related Y-parameters into account, the extracted values are determined as follows:

1) Rearrange (14) as:

\frac{\omega^2}{Real(Y_1)} = \omega^2 K_1 + H   (21)

where

K_1 = R_{sub}   (22)

H = \frac{1}{R_{sub} C_{sub2}^2}   (23)

Using (21), R_sub can be extracted from the slope of the linear regression of the experimental ω²/Real(Y_1) versus ω². After substituting R_sub, (23) gives C_sub2. Thus, C_sub1 can be calculated as follows:

C_{sub1} = \omega^{-1}\left(Imag(Y_1) - \frac{\frac{1}{\omega C_{sub2}}}{R_{sub}^2 + \left(\frac{1}{\omega C_{sub2}}\right)^2}\right)   (24)

2) Rearrange (19) as:

\frac{1}{Re(Y_{3\_LowFreq})} = R_s + \omega^2 \frac{L_s^2}{R_s}   (25)

Using (25), R_s can be extracted from the intercept of the linear regression of the experimental 1/Re(Y_{3\_LowFreq}) versus ω². With R_s known, L_s can be calculated from the slope. Once L_s and R_s are subtracted, C_couple can be determined by using (20).

The elements (R_sk and L_sk) introduced to represent the skin effect are extracted from the Y-parameters at high frequency. As C_couple, R_s and L_s have been determined, by using

Y_3 = \left[-Y_{l_1-l_2,12}\right] = j\omega C_{couple} + \left[Real(Z_3) + j \cdot Imag(Z_3)\right]^{-1}   (26)

Real(Z_3) can be rearranged, via (17), as:

\left[Real(Z_3) - R_s\right]^{-1} = \frac{1}{R_{sk}} + \frac{R_{sk}}{\omega^2 L_{sk}^2}   (27)

Thus, R_sk can be extracted from the intercept of the linear regression of the experimental [Real(Z_3) - R_s]^{-1} versus ω^{-2}, and with R_sk known, L_sk can be calculated from the slope.
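The linear regressions behind (21)-(27) are straightforward to script. The following Python sketch is our illustration of steps 1) and 2) above (the function and array names are assumptions, and the 10 GHz split between the low- and high-frequency fits follows the text):

```python
import numpy as np

def extract_parameters(freq, Y1, Y3, f_low=10e9):
    """Analytic extraction following eqs. (19)-(27).
    freq: frequencies [Hz]; Y1, Y3: complex admittances from eqs. (10), (12);
    f_low: frequency below which the skin-effect branch is neglected."""
    w = 2 * np.pi * freq
    # Step 1: regress w^2/Real(Y1) on w^2 -> slope K1 = R_sub, intercept H.
    K1, H = np.polyfit(w**2, w**2 / Y1.real, 1)        # eqs. (21)-(23)
    R_sub = K1
    C_sub2 = 1.0 / np.sqrt(H * R_sub)
    # Eq. (24): C_sub1 from the imaginary part (averaged over frequency).
    x = 1.0 / (w * C_sub2)
    C_sub1 = np.mean((Y1.imag - x / (R_sub**2 + x**2)) / w)
    # Step 2: low-frequency regression of 1/Re(Y3) on w^2 -> L_s, R_s.
    lo = freq < f_low
    slope, R_s = np.polyfit(w[lo]**2, 1.0 / Y3[lo].real, 1)   # eq. (25)
    L_s = np.sqrt(slope * R_s)
    # Eq. (20): C_couple from the low-frequency imaginary part.
    C_couple = np.mean((Y3[lo].imag + w[lo] * L_s /
                        (R_s**2 + w[lo]**2 * L_s**2)) / w[lo])
    # Skin-effect elements from the high-frequency series impedance, eq. (27).
    Z3 = 1.0 / (Y3 - 1j * w * C_couple)
    hi = freq >= f_low
    s, intercept = np.polyfit(1.0 / w[hi]**2,
                              1.0 / (Z3[hi].real - R_s), 1)
    R_sk = 1.0 / intercept           # intercept = 1/R_sk
    L_sk = np.sqrt(R_sk / s)         # slope = R_sk / L_sk^2
    return dict(R_sub=R_sub, C_sub2=C_sub2, C_sub1=C_sub1,
                R_s=R_s, L_s=L_s, C_couple=C_couple, R_sk=R_sk, L_sk=L_sk)
```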


4. Measurements and Verification

Two interconnect test structures are fabricated employing a SMIC 0.18 µm 1P6M RF CMOS technology. The width of the two lines is fixed at 10 µm, and the lengths are 150 µm (l_1) and 50 µm (l_2) respectively. Two-port measurements were performed with the Agilent E8363B Network Analyzer and a CASCADE Summit probe station. The measurements were calibrated using the short-load-open-through (SLOT) algorithm provided with the VNA. The de-embedding and parameter extraction procedure is carried out as described above. The extracted parameter values are summarized in Table 1.

Simulations of the proposed equivalent circuit model based on the extracted parameter values are performed in ADS2005A. The simulated Y-parameters are compared with the corresponding measured ones and the plots are presented in Fig. 4 to Fig. 7. Considering the symmetry of the physical structure, the Y-parameters are also symmetrical, i.e. Y11 = Y22 and Y12 = Y21 [14]. Therefore, only Y11 and Y12 are compared, and Real(Yij) and Imag(Yij) are plotted separately.

Table 1. Extracted values for the l_1 - l_2 segment model

L_s      79 pH      C_sub1    2 fF
R_s      0.285 Ω    C_sub2    5.7 fF
L_sk     13.7 pH    C_couple  3.405 fF
R_sk     1.52 Ω     R_sub     2.089 kΩ

Fig. 4. Comparison of simulated and measured Real(Y_{1,1}) versus frequency (0-40 GHz).

Fig. 5. Comparison of simulated and measured Imag(Y_{1,1}) versus frequency (0-40 GHz).

Fig. 6. Comparison of simulated and measured Real(Y_{1,2}) versus frequency (0-40 GHz).

Fig. 7. Comparison of simulated and measured Imag(Y_{1,2}) versus frequency (0-40 GHz).

5. Conclusion

An improved de-embedding methodology, which can accurately remove pad parasitics and pad-line discontinuities, is presented for lossy RF-CMOS interconnect modeling. A novel one-Π equivalent circuit model for RF on-wafer interconnects is proposed. Based on the S-parameter measurements and with an analytical parameter extraction procedure, the model is constructed and the element parameters are extracted efficiently. The accuracy of the model based on the extracted parameter values has been verified by the excellent agreement between the simulated results and the measurement data. This demonstrates that the improved de-embedding method and the proposed equivalent circuit model are capable of accurately characterizing the behaviour of RF-CMOS interconnects.

References

[1] R. Hossain, F. Viglione, M. Cavalli, "Designing fast on-chip interconnects for deep submicrometer technologies," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 2003, 11: 276.
[2] Q. Xu, P. Mazumder, "The fifth-order differential quadrature methods," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 2003, 11: 1068.
[3] W. Sun, W. Dai, W. Hong, "Fast parameter extraction of general interconnects using geometry independent measured equation of invariance," IEEE Trans. MTT, 1997, 45(5): 827.
[4] F. Alimenti, V. Palazzari, P. Placidi, G. Stopponi, A. Scorzoni, and L. Roselli, "Analysis of CMOS interconnects combining Le-FDTD method and SoC procedure," IEEE MTT-S Dig., Seattle, WA, 2002, 2: 879.
[5] R. H. Havemann, J. A. Hutchby, "High-performance interconnects: An integration overview," Proceedings of the IEEE, 2001, 89(5): 586.
[6] D. A. White, M. Stowell, "Full-wave simulation of electromagnetic coupling effects in RF and mixed-signal IC's using a time-domain finite-element method," IEEE Trans. MTT, 2004, 52(5): 1404.
[7] B. Kleveland, X. Qi, L. Madden, "High-frequency characterization of on-chip digital interconnects," IEEE Trans. Solid-State Circuits, 2002, 37: 716.
[8] B. Kleveland, C. H. Diaz, L. Madden, et al., "Exploiting CMOS reverse interconnect scaling in multigigahertz amplifier and oscillator design," IEEE Trans. Solid-State Circuits, 2001, 36: 1480.
[9] Xiaomeng Shi, Jian-Guo Ma, et al., "Equivalent Circuit Model of On-Wafer CMOS Interconnects for RFICs", IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 2005, 13(9): 1060.
[10] X. Shi, J. Ma, B. H. Ong, K. S. Yeo, M. A. Do, and E. Li, "Equivalent circuit model of on-wafer interconnects for CMOS RFICs," Radio and Wireless Conference, 2004: 95.
[11] Alain M. Mangan, Sorin P. Voinigescu, "De-Embedding Transmission Line Measurements for Accurate Modeling of IC Designs", IEEE Trans. Electron Devices, 2006, 53(2): 235.


Section 4B
eLearning


Assessing Power Saving Techniques and Their Impact on E-learning Users

Arghir-Nicolae Moldovan
School of Computing
National College of Ireland
Mayor Street, Dublin 1, Ireland
E-mail: amoldovan@student.ncirl.ie

Cristina Hava Muntean
School of Computing
National College of Ireland
Mayor Street, Dublin 1, Ireland
E-mail: cmuntean@ncirl.ie

Abstract—As mobile devices are constantly improving, they have started to be used for online learning. If the delivery of educational content involves multimedia streaming, additional pressure is put on the battery life, causing it to discharge faster. Running out of battery life during a learning session can have a negative impact on the learner's Quality of Experience. This paper suggests that adaptive e-learning environments should become energy-aware and that power saving techniques should be considered in order to assist the learner in a low power situation. The impact of various factors on the battery life was analysed and the results show that significant power saving can be achieved by applying different techniques during the multimedia streaming process. However, subjective tests conducted on a small group of participants show that some of these techniques have a negative impact on end-user perceived quality and may not be suitable for an e-learning environment.

Keywords—power saving; e-learning; multimedia content adaptation; subjective quality evaluation

I. INTRODUCTION

Over the past few years, technology has become a part of our life like never before. More and more people use it as a way to stay in touch with friends, to access information, to work, to study, or for entertainment. With the help of the new technologies, e-learning has become an increasingly important form of education, and many educational institutions have extended their activity onto the Internet.

At the same time, users in general, and learners in particular, have become more oriented towards mobility. A recent study report published by the Educause Centre for Applied Research shows that laptop ownership among undergraduate students increased from 65.9% in 2006 to 82.2% in 2008 [1]. The study was conducted in 44 universities and colleges in the USA.

Mobility comes with a lot of advantages for learners, but also with a number of limitations. A major limitation is that, while on the move, learners rely mainly on the mobile device battery power supply. Given the fact that an increasing number of e-learning environments have included multimedia content in their applications, accessing a course over the wireless network can quickly drain the power from the battery. If learners have to stop their activities due to low power situations, not only is their Quality of Experience (QoE) affected, but the learning outcome may also be significantly reduced.

In this context, e-learning environments have to become energy aware and assist learners in maximising their learning outcome. Therefore, power saving techniques must be integrated with the e-learning applications. Battery life will be extended by personalising the educational multimedia content depending on the available power resources of each particular mobile device that is used.

This paper presents an experimental evaluation of some of the factors that have an impact on the battery power consumption when multimedia content is delivered to mobile devices. It also looks at different actions that can be taken in order to reduce the power consumption.
These actions can be integrated into a power saving mechanism which will extend the functionality of adaptive e-learning systems.

Considering the fact that users are becoming increasingly quality-aware, subjective testing was conducted on a group of participants in order to assess how their perceived quality is impacted when power saving actions are applied.

The paper is structured as follows. Section two consists of a literature review of the existing power saving techniques in the wireless communication area and previous research in the area of adaptive learning. The paper continues with experimental results assessing the impact of various factors on battery power consumption, and with the results of the subjective evaluation. In the end, conclusions are drawn and future directions are presented.

II. RELATED WORK

In the last decade, adaptation and personalisation have gradually been brought to the forefront of research and, as a result, a large number of applications in the technology-enhanced learning area have been proposed. Various aspects


were investigated as important input in the personalisation of the course material and learning process.

Content personalisation may be driven by the needs of individual learners, their preferences, goals [2], knowledge level [3] or cognitive preferences [4]. Personalisation of the learning process may be driven by the user's abilities, motivation and their previous interaction with the e-learning environment, as well as by the learner's concentration level and frequency of disruptions [5].

Most of the research in the area of adaptive e-learning has concentrated on delivering personalised educational content based on learner characteristics. More recently, research has focused on proposing solutions for multimedia content adaptation according to learner device type and characteristics [6], [7], network type and conditions [8] and user Quality of Experience (QoE) [9].

Although much research has been performed in this area and there is clear evidence that the available power of the mobile device could affect the learning process, no study has considered the learner device battery level, apart from other characteristics, in the personalisation process of the multimedia content.

Batteries have improved over the last few years, but a combination of multimedia tasks using multiple components simultaneously (such as the screen and speakers, CPU, memory and the Wireless Network Interface Card (WNIC)) may still drain the battery power quickly.

However, battery life is not a new issue and many power saving techniques in wireless communications have been proposed over time. For the particular case of multimedia streaming, the proposed solutions can be classified into the following categories:

1) Power saving in the reception stage of the multimedia stream. These solutions look at sending and receiving data and mainly focus on maintaining the WNIC in a low power state for a longer period. One proposed solution is to send video frames in bulk, based on the network traffic shape, instead of sending them individually; thus the data waiting time for the device is reduced [10]. Another proposed solution uses periodic bulk transfer of the video data in order to reduce the working time of the wireless card, combined with a decrease of the video quality at an intermediate proxy node [11]. Existing research shows that significant improvements in the battery life can also be achieved by extending the power saving mechanism already built into the IEEE 802.11 standard [12]. This solution uses an additional buffer to hide the data corresponding to several beacon intervals from the station it is intended for, forcing the station to return to sleep. The buffered data is finally released at once to the mobile station after several attempts to receive it.

2) Power saving in the decoding stage. A study [13] has proposed to use dynamic online feedback for setting the average frame decode rate to the same value as the display rate in order to save power in the decoding stage.

3) Power saving in the playing stage. The majority of the solutions that have been proposed for saving power in the playing stage have focused on the device screen. Battery power consumption can be reduced by optimising the backlight power consumption [14], or by extending Dynamic Luminance Scaling (DLS) to cope with transflective LCD panels that can operate with or without backlight [15], depending on the battery level and ambient luminance.
Power saving can also be achieved by adjusting the screen brightness and volume level, but since the user has control over these settings, little research has been performed in this area.

The literature review shows that numerous power saving techniques with good results have already been proposed. The novelty of this research consists in the fact that it bridges the well researched area of personalised e-learning with power saving in wireless communications, and looks not only to increase the battery life, but also to improve the learning process.

III. FACTORS THAT INFLUENCE BATTERY POWER CONSUMPTION OF THE MOBILE DEVICES

Playing multimedia educational content on a mobile device is an energy intensive task, especially when the content is streamed over a wireless network. In this situation significant power is consumed by the WNIC for retrieving and processing the multimedia stream, in addition to the power consumed by other components for decoding and playing the audio-video sequence. The multimedia streaming process can be seen as consisting of three different stages: data reception, decoding and playing. Each of them contributes more or less to the overall battery consumption of the mobile device.

Various experimental tests (see Table I) were conducted in order to investigate the impact of each of the three stages on the battery power consumption. Different factors specific to each stage were also considered. For testing purposes, a video clip with a reduced degree of motion content was used. Various versions of the studied video clip, having different values of the encoding parameters, were created. This set of videos can be classified according to the encoding parameter that is studied. Any two videos corresponding to a specific encoding parameter have different values for that parameter. The rest of the encoding parameters are constant, with values from the following groups: video compression - H.264 encapsulated in the MP4 multimedia container, resolution - 320 x 240 pixels, frame rate - 24 fps and average video bit rate - 384 Kbps.

Depending on the performed test, each video was streamed to the mobile device and/or played locally in a loop until the battery was completely discharged.


To maximize the accuracy of the results, similar conditions were kept between any two different tests. The only applications running on the mobile device were the media player and the program used for tracing information on the battery power consumption. For the case when a video was streamed, the mobile device was maintained in a fixed position where the wireless signal strength was high and constant. For the case when the video was played locally, the WNIC was switched off. To ensure that battery degradation in time has a minimum impact on the results, the tests corresponding to the same parameter were performed consecutively, with the minimum time necessary to charge the battery between them. All the tests were performed in a laboratory where the temperature was constant.

TABLE I. EXPERIMENTAL TESTS PERFORMED

Test: Streaming and Local Playback
  Stages Involved: Reception, Decoding, Playing
  Parameter Category: Encoding Parameters
    Video Compression: H.264, WMV 9, MPEG4 Part 2, H.263
    Video Resolution: 480 x 360, 320 x 240, 240 x 180
    Video Frame Rate: 24 fps, 20 fps, 16 fps
    Average Video Bitrate: 512 Kbps, 384 Kbps, 192 Kbps

Test: Local Playback
  Stages Involved: Decoding, Playing
  Parameter Category: Device Settings
    Volume Level: 100%, 50%, 0%
    Screen Brightness: 100%, 50%
    CPU Clock Speed: 520 MHz, Auto Speed, 208 MHz

The mobile device consisted of a Dell Axim PDA with an IEEE 802.11b wireless card and a 520 MHz ARM processor, running Microsoft Windows Mobile. Additionally, a laptop with a 2 GHz Intel Core 2 Duo processor and 2 GB of RAM memory, running Microsoft Windows Vista, was used as a streaming server when the videos were streamed to the mobile device. Additional details on the test setup can be found in [16].

A. The Impact of WNIC on the Battery Life

In the reception stage, most of the power is consumed by the WNIC for network related tasks, such as receiving the data packets and processing them. In order to assess the WNIC's effect on the battery power consumption, a number of comparative tests were performed in which the same versions of the multimedia clip were first streamed from the server to the mobile device via an Access Point (AP), and then stored and played locally on the PDA.

The results presented in Fig. 1.a-d show that when the multimedia clips were streamed, the battery discharged in half of the time required when the same clips were played locally. Only the times needed to discharge the battery from 50% battery charge to 1% battery charge were considered in the comparison. For example, in the case of the H.264 video with a frame rate of 16 fps, the battery life was reduced by 52%, from 84.22 minutes when the video was played locally to 40.4 minutes when it was streamed and played in real time (see Fig. 1.c).

The conclusion that can be drawn is that the WNIC is responsible for approximately 50% of the total battery consumption when retrieving the streamed multimedia clip, and the reception stage is where significant power can be saved by applying different techniques.

B. The Impact of Encoding Parameters on Battery Life

Before it can be played, a multimedia clip has to be decoded according to the encoding scheme used. The CPU and memory are the main components responsible for the power consumption at this stage. In order to assess the effect of the encoding scheme on battery power consumption, the following multimedia clip parameters were considered: video compression technique, video resolution, frame rate and bitrate. Comparative results (see Fig.
1.a-d) show that the battery life can be increased by changing the video compression technique to a more energy efficient one or by reducing the values of the other encoding parameters.

Fig. 1.a shows that an increase in battery life of 21.53% can be achieved by changing the video compression technique from H.264 to H.263 (encapsulated as a Flash video) when the videos are played locally on the device. The improvement was even higher when the same videos were streamed to the mobile device (27.6%).

As opposed to the video compression technique, reducing the quantitative encoding parameters will have a higher impact on the battery life if the videos are played locally, rather than streamed. For example, by reducing the resolution from 480 x 360 pixels to 240 x 180 pixels, the frame rate from 24 fps to 16 fps and the average video bitrate from 512 Kbps to 192 Kbps, the battery life increased by 25.66%, 15.31% and 14.67% respectively when the videos were played locally, and by 15.93%, 10.94% and 7.27% respectively when the videos were streamed to the PDA (see Fig. 1.b-d).

Concluding this, one can say that battery power can be saved by varying the encoding parameters of the multimedia clip being delivered, but the amount of power saved is significantly smaller than that in the reception stage of the multimedia streaming process.
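The percentage figures quoted above and plotted in Fig. 1 follow directly from the measured 50%-to-1% discharge times. A trivial Python illustration, using two streamed-case values as they appear in the Fig. 1.a data (our sketch, not the authors' tooling):

```python
def battery_life_increase(t_baseline_min: float, t_improved_min: float) -> float:
    """Percentage battery-life increase of an alternative configuration
    relative to a baseline, from measured 50%-to-1% discharge times."""
    return 100.0 * (t_improved_min - t_baseline_min) / t_baseline_min

# H.264 vs. H.263 while streaming: 36.42 min -> 46.47 min
print(round(battery_life_increase(36.42, 46.47), 1))  # ~27.6 (%)
```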


Fig. 1. Battery life and battery life increase (compared with the least energy efficient case from each group), when videos with different encoding parameters are streamed and played locally on the mobile device: (a) the effect of the video compression technique on battery life; (b) the effect of video resolution on battery life; (c) the effect of video frame rate on battery life; (d) the effect of average video bitrate on battery life.

C. The Impact of Device Settings on the Battery Life

Depending on the type of multimedia content, the device screen and speakers are the main components responsible for the power consumed when the multimedia clip, previously decoded, is played to the user. A new set of tests was carried out in order to assess the effect of the device components, and especially their settings, on the battery life. Each test consisted of playing the same video file locally until the battery was completely discharged. The WNIC was switched off to eliminate its effect on power consumption and to increase the contribution of the component whose settings were changed.

To assess the effect of the speakers on the battery life, the volume was set to three different levels and the time needed to discharge the battery was measured. The levels considered were 0%, 50% and 100%. Fig. 2.a shows that if the media clip being played consists of an audio-video sequence, the sound playback by the speaker makes a small contribution to the overall power consumption of the mobile device. Turning the sound OFF will increase the battery life by only 9% compared with the case when the volume is set to 100% both in the media player and in the Operating System. Since no important benefit was noticed when the volume level was set to 50%, that case was not plotted on the graph.

To investigate the effect of the device screen on the battery consumption, two levels were considered for the screen brightness: 100% and 50%. Fig. 2.b shows that, in comparison with the sound volume, screen brightness has a higher impact on power consumption. An increase in battery life of approximately 31% is achieved when the screen brightness is reduced to half of its maximum level.

A last set of experimental tests was conducted in order to investigate the effect of the CPU on the battery power consumption. For this, the CPU clock speed was set to three different values: 520 MHz, auto speed and 208 MHz.
Fig. 2.c shows that, compared with the case when the clock speed is set at 208 MHz, changing this speed to 520 MHz will increase the battery consumption by 25%, while letting the system choose the optimum speed reduces the battery life by 17%.

D. Reflections on the Results

The experimental results presented above show that during the streaming process of a multimedia clip, there are multiple options for extending the battery life. By turning ON the WNIC to retrieve the multimedia content from the network, the battery consumption increases by up to 50%. So it is at this stage that significant battery power can be saved. The longer the time that the WNIC spends in a low power state before retrieving data, the higher the amount of


power that is saved. Solutions for extending this time must be found without introducing delays that can negatively impact the learning process.

Fig. 2. Battery life (battery level [%] versus time [sec]) when changing different settings of the mobile device: (a) the sound volume level (100% vs. 0%); (b) the screen brightness (100% vs. 50%); (c) the CPU clock speed (520 MHz, auto speed, 208 MHz).

By changing the encoding parameters of the streamed multimedia clip, additional power can be saved. There are two possible options to do this. The first is the advance creation of multiple versions of the multimedia content which differ in terms of encoding parameters. This method requires more storage space on the server side, but this is not a serious issue, as over the last years storage devices have exponentially increased in capacity and their price has drastically dropped. A second option makes use of transcoding, reducing the required storage space and enabling real time modifications of the encoding parameters to be performed more easily. The disadvantage of this solution is that it requires high processing power on the server side, especially when a high number of users are accessing the application at the same time.

The power saving techniques related to the reception and the decoding of the multimedia stream can be implemented on the server side, or between the server and the client. The users have reduced control over how data is received and over the amount of resources necessary to decode and play the multimedia content that is being sent. If the e-learning environment allows them, they have the option to download the multimedia content and view it afterwards, saving power at the same time. In this case the reception is controlled by the users, but it is efficient only if there is enough network bandwidth available to allow them to download a clip in a much shorter time than that needed to stream it. On the other hand, the users have a high level of control over the device settings. Changing these settings in order to save power can have a negative impact on users' satisfaction.

It is worth mentioning that in the case of this particular device, the battery has a nonlinear discharge characteristic (see Fig. 2.a-c), but with several approximately linear sections (50% to 20%, 20% to 10% and 10% to 1%). Also, a significant part of the total battery life increase achieved by changing a specific parameter corresponds to the section where the battery level decreases from 10% to 1%. A power saving algorithm must consider the discharge characteristic of the battery model corresponding to the mobile device that is being used to access the multimedia content, but also the battery level at which the device is set to turn off, usually between 5% and 10% for laptops or PDAs.

IV. PRELIMINARY SUBJECTIVE EVALUATION

This section presents preliminary results of subjective tests that have been carried out in order to assess the impact on end-user perceived quality of some of the actions that can be taken to increase the battery life. A small number of participants have attended and further tests are ongoing. The evaluation addressed only the encoding parameters, because their variation is easier to control and is directly reflected in the final quality of the multimedia clip.


Fig. 3. End-user perceived quality (mean rating on a 1-5 scale) when varying the encoding parameters:
  Video Compression: H.263 3.33, MPEG4 4.00, WMV 9 3.83, H.264 4.17
  Resolution: 240 x 180 2.17, 320 x 240 4.17, 480 x 360 3.83
  Frame Rate: 16 fps 2.67, 20 fps 3.67, 24 fps 3.83
  Bitrate: 192 Kbps 3.00, 384 Kbps 3.83, 512 Kbps 4.33

A. Evaluation Setup

Four sets of short video sequences were created, one for each of the encoding parameters whose impact on battery life was previously studied. In particular, each set consisted of three or four video sequences with different values for the parameter associated with that set, and constant values for the rest of the parameters.

To eliminate the influence of other factors, such as fluctuations in the available network bandwidth, on the final perceived quality, the video sequences were stored and played locally on the mobile device. Also, to keep a uniform testing environment, all the participants used the same PDA device for viewing the video sequences. The screen brightness was set to 100% and the volume level was set at an adequate level for the laboratory environment where the testing was conducted. None of the participants changed the mobile device settings, even though they were allowed to do so.

The subjective tests were performed with one participant at a time and, before starting, he/she was introduced to the test environment and to the method of assessment. A five-grade quality scale (i.e. 1 - Bad, 5 - Excellent) was used for this purpose. The participant was asked to rate his/her overall impression given by each sequence in part, and to mark the corresponding checkbox on a form that was provided. Code names were associated with the video sequences so that the viewers were not aware of the parameters being analysed. The overall duration of the test session, including introduction, viewing and rating, was planned to last less than 20 minutes. A different order of displaying the video sequences was chosen for each participant in the subjective test.

B. Preliminary Results

Fig. 3 presents the mean scores achieved by each particular sequence. Preliminary results show that changing the video compression technique has a lower impact on end-user perceived quality than decreasing the resolution, the frame rate or the bitrate. Significant battery power can be saved by changing the video compression while maintaining a good quality level. For example, by changing the video compression from H.264 to MPEG4 Part 2, there is an improvement in battery life of 19.27% when the video is streamed to the mobile device and 15.35% when the video is played locally (see Fig. 1.a). At the same time a good level of quality was maintained: the sequence encoded using MPEG4 scored an average rating of 4 out of 5, whereas the sequence encoded using H.264 scored an average of 4.17 out of 5.

Results also show that by reducing the other encoding parameters to levels that offer a good energy saving, the user perceived quality is significantly reduced. At the same time, a slight decrease which maintains a good level of quality may not have real benefits in terms of power saving. For example, by reducing the frame rate from 24 fps to 16 fps, an increase in battery life of 10.94% could be achieved for the case when the videos are streamed, but the quality was significantly reduced; the corresponding video sequence scored an average of 2.67 out of 5. Reducing the frame rate only to 20 fps, the quality is still good, but the increase in battery life that can be achieved is only 2.79%.

As can be seen in Fig.
3, the video sequence with aresolution of 320 x 240 pixels, achieved a higher score thanthat with a resolution of 480 x 360 pixels, when the contrarywas expected. This is explained by the fact that devicescreen had a resolution of 320 x 240 pixels and additionaltasks were required for scaling down the video, negativelyinfluencing the user perceived quality.If e-learning users are in the middle of a learningactivity and an interruption occurs due to insufficient batterypower, their QoE can be negatively impacted. This can alsohappen if the quality of the multimedia content is reducedtoo much, even if by doing this sufficient power is saved toallow them to complete the learning activity. To maximisethe learning outcome, a power saving solution for e-learningenvironments must find the right balance between theamount of power saved and the user QoE.186


V. CONCLUSION AND FUTURE WORK

The goal of this paper was to assess the factors behind battery power consumption when multimedia content is streamed and played on a mobile device. A number of experimental tests were carried out, which have shown that during the multimedia streaming process data reception accounts for half of the total power consumption of the mobile device. Results have also shown that battery power consumption is influenced by the encoding parameters of the multimedia clip and by the settings of the mobile device. However, their contribution is found to be significantly smaller than that of the WNIC.

Preliminary subjective testing was conducted in order to assess the impact of the encoding-related power saving actions on the end-user perceived quality. The conclusion drawn from these tests was that changing the encoding parameters while maintaining a good quality may not save enough power to improve the learning process. Therefore most of the effort must be concentrated on saving power in the reception stage of the multimedia streaming. Encoding-related techniques should be used when the power saved in the reception stage is still not enough to maximise the learning outcome.

Future work will address the deployment of a power save mechanism that will incorporate various techniques specific to different stages of the multimedia streaming process. Further experimental testing will be conducted on various mobile devices, in order to propose a battery-independent discharge model that accurately estimates the remaining battery life. Elaborate subjective testing will also be conducted on a large number of participants in order to assess the benefits of the proposed solution in terms of battery power saving, end-user satisfaction and learning improvement.

ACKNOWLEDGMENT
The support of Science Foundation Ireland is gratefully acknowledged.

REFERENCES
[1] G. Salaway, J.B. Caruso, and M.R. Nelson, "ECAR Study of Undergraduate Students and Information Technology, 2008," EDUCAUSE Center for Applied Research, vol. 8, Oct. 2008, p. 10.
[2] R. Clifford, "Adaptive Hypermedia for Music Instruction", 7th International Technological Directions in Music Learning Conference, 2000.
[3] P. De Bra, A. Aerts, B. Berden, B. de Lange, B. Rousseau, T. Santic, D. Smits, and N. Stash, "AHA! The adaptive hypermedia architecture," Proceedings of the fourteenth ACM conference on Hypertext and hypermedia, ACM Press, New York, NY, USA, 2003, pp. 81-84.
[4] S.Y. Chen and R.D. Macredie, "Cognitive styles and hypermedia navigation: Development of a learning model," Journal of the American Society for Information Science and Technology, vol. 53, 2002, pp. 3-15.
[5] B. Bomsdorf, "Adaptation of Learning Spaces: Supporting Ubiquitous Learning in Higher Distance Education," Mobile Computing and Ambient Intelligence: The Challenge of Multimedia, Dagstuhl Seminar Proceedings, 2005.
[6] F. Meawad and G. Stubbs, "A framework for enabling on-demand personalised mobile learning," International Journal of Mobile Learning and Organisation, vol. 2, 2008, pp. 133-148.
[7] S.A. Petersen and J.K. Markiewicz, "PALLAS: Personalised Language Learning on Mobile Devices," Fifth IEEE International Conference on Wireless, Mobile, and Ubiquitous Technology in Education (WMUTE 2008), 2008, pp. 52-59.
[8] C.H. Muntean, "Improving Learner Quality of Experience by Content Adaptation based on Network Conditions," Computers in Human Behavior Journal, Special issue on "Integration of Human Factors in Networked Computing", vol. 24, 2008, pp. 1452-1472.
[9] C.H. Muntean and G.M. Muntean, "End-User Quality of Experience Aware Personalised E-Learning," Architecture Solutions for E-Learning Systems, C. Pahl, ed., IGI Global, 2008, pp. 154-174.
[10] F. Zhang and S. Chanson, "Proxy-assisted scheduling for energy-efficient multimedia streaming over wireless LAN", 4th Int. IFIP-TC6 Networking Conference, Lecture Notes in Computer Science, vol. 3462, 2005, pp. 980-991.
[11] M. Tamai, T. Sun, K. Yasumoto, N. Shibata, and M. Ito, "Energy-aware video streaming with QoS control for portable computing devices," Proceedings of the 14th international workshop on Network and operating systems support for digital audio and video, Cork, Ireland: ACM, 2004, pp. 68-73.
[12] J. Adams and G.M. Muntean, "Adaptive-Buffer Power Save Mechanism for Mobile Multimedia Streaming," IEEE International Conference on Communications (ICC '07), 2007, pp. 4548-4553.
[13] Z. Lu, J. Lach, M. Stan, and K. Skadron, "Reducing multimedia decode power using feedback control," Proceedings of the 21st International Conference on Computer Design, 2003, pp. 489-496.
[14] S. Pasricha, S. Mohapatra, M. Luthra, N. Dutt, and N. Venkatasubramanian, "Reducing backlight power consumption for streaming video applications on mobile handheld devices", Proc. First Workshop on Embedded Systems for Real-Time Multimedia, 2003, pp. 11-17.
[15] H. Shim, N. Chang, and M. Pedram, "A backlight power management framework for battery-operated multimedia systems," IEEE Design and Test of Computers, vol. 21, no. 5, 2004, pp. 388-396.
[16] A.N. Moldovan and C.H. Muntean, "Personalisation of the multimedia content delivered to mobile device users," IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB '09), 2009, pp. 1-6.


Billing Issues when Accessing Personalised Educational Content

Andreea Molnar, Cristina Hava Muntean
National College of Ireland, School of Computing, Mayor Street, Dublin 1, Ireland
amolnar@student.ncirl.ie, cmuntean@ncirl.ie

Abstract
The increased affordability of mobile devices combined with the availability of the latest wireless technologies has made mobile devices an attractive tool for learning. Nowadays learners can choose between multiple wireless networks with different characteristics, belonging to the same or to different mobile operators. Unfortunately, Internet billing plans are still difficult for most users to predict and control. This paper presents an algorithm which aims to determine the best network, from a list of available ones (in terms of price), for delivering the selected educational content.

1. Introduction

Mobile device popularity has increased tremendously in recent years. For example, more than half of the world population has a mobile phone [1]. Mobile devices are present everywhere and they have become more and more accessible. Their prices have dropped, their portability has increased and the performance offered by mobile networks has improved considerably. Due to their pervasive presence, as well as to the tremendous development of new features and capabilities, they have become an attractive tool for education.

Owning a mobile device that has connectivity to one or more wireless networks makes access to educational content easy at any time and from anywhere. Mobile devices ease the learner's access to information, helping them to reach the right resource at the right time. At the same time, they are particularly useful for learners who do not have the time to plan a learning session and who usually study in unplanned situations. Difficulties due to time constraints have been observed especially for part-time students. Becking et al. [2] noted these phenomena, giving examples of unpredictable situations where learners could benefit from having access to the educational content. Among the examples given is that of a salesman who travels a lot and may learn while s/he is on the train. Another example is that of a mother who is waiting for her turn in the doctor's waiting room.

Even though mobile devices offer new opportunities for learning, they have some restrictions: small screen size, limited number of buttons, battery life limitations, etc. Therefore, offering guidance to the learners and providing them with adequate educational content suited to their needs is an important issue addressed by learning systems. Adaptive e-learning systems offer solutions to these problems by providing guidance and personalised material suitable to the learner. Different user characteristics have been taken into consideration in the adaptation process, such as: knowledge [3], goal [4], learning styles [5], prerequisites and experience [6], network performance [7], etc. Lately, learner device characteristics were also considered in the personalisation process [8, 9, 10].

However, to the best of our knowledge, none of these e-learning systems has considered that the learner may choose between different networks when the mobile device offers access to more than one wireless network.
For example, a number of mobile devices that include these features are listed below:
• PDA O2 XDA Zinc has access to 3G, WiFi and GPRS;
• HTC TyTN II has access to HSDPA/UMTS, WiFi, GSM, EDGE and GPRS;
• HTC P3300 has access to GSM/GPRS/EDGE and WiFi;
• Mobile Pocket PC i-mate Jasjar has access to GPRS, WiFi, etc.
Each type of wireless network may have both different delivery performance and different billing plans. Cheaper alternatives may prompt the learner to switch manually between the networks.

This paper presents a Performance Aware and Cost Oriented e-Learning Framework (PACO-eLF) that supports content personalisation by taking into account the learner's profile, the device used, the network characteristics and the cost the learner has to pay for accessing the content. An algorithm that determines the best network, from a list of available ones (in terms of price), for delivering the personalised educational content is also described.


Figure 1. Multiple network selection

The rest of the paper is organised as follows. Section 2 presents an overview of the existing billing models for Internet access through mobile devices. Section 3 briefly introduces PACO-eLF and presents the network selection algorithm. Section 4 presents the conclusions we have arrived at so far and describes new directions in which our research work will continue.

2. Billing models for Internet access through mobile devices

Access to multiple networks offers the learner more possibilities of retrieving the educational content, by choosing a given mobile service (Figure 1). Ideally the learner selects, from the available networks, the one which offers him/her the best price and performance, or at least the best trade-off between them. Unfortunately this is not always the case. Mobile data billing systems are still difficult for most users to understand, and determining the best network in terms of performance often requires engineering knowledge. This problem becomes even more important in the context of the wireless channel, where network resources are limited.

The diversity of billing schemes that currently exist on the market (Table 1) does not help the learner in making a decision. The most common data billing plan in mobile communication is the flat rate bundle [11].

Table 1. Mobile billing plans diversity (plan types offered: time billing, data billing, bundle billing, monthly flat rate, other services included)
T-Mobile (USA): X X X
Meteor (Ireland): X
Three (Ireland): X
Vodafone (Ireland): X
O2 (Ireland): X
Indosat (Indonesia): X X X X
Mobility (Saudi Arabia): X X X
Vodacom (South Africa): X X

Other billing plans include, but are not limited to:
• Time based billing (paying for the amount of time spent using the Internet, for example 0.005€/minute)
• Data based billing (paying for the amount of data consumed, for example 0.2€/MB)


• Monthly flat rate (unlimited Internet access, paid monthly)
• Free Internet access

Sometimes billing plans are much more complex. Depending on the carrier policy, a user can pay more when visiting some websites or not pay at all (e.g. when visiting the carrier portal). The price may also depend on the connection speed or on the time of day when the Internet is accessed. Most of the time, mobile data traffic is charged separately from other services such as SMS (Short Message Service), MMS (Multimedia Messaging Service), calls, etc. Each particular service may have a different associated cost. However, there are times when they are included in one single package.

Time based billing is relatively easy to understand, but the attention of the user is on optimising his/her actions in order to spend less time on the Internet when retrieving the actual information s/he is requesting [12]. Data based billing is not so easy for users to understand. Most of them do not know how to predict the amount of data downloaded. Sometimes there is no way to find out how much the users spend. Even when the mobile operator facilitates this process by providing a counter for the data used so far, some of the users do not know that it exists or how to use it [12].

The data bundle has the advantage of offering data at a low price. However, when the user exceeds the data limit the price becomes quite high. Most of the users do not realise when they exhaust the amount of data available in the bundle. This leads to high bills, discouraging users from accessing the Internet through their mobile devices [12, 13].

Choosing the right network in terms of price and controlling the cost may distract the learner's attention from the educational content he/she is presented with. Therefore, there is a need for an automatic mechanism that assesses the billing plans. This paper presents an algorithm that aims to help the learners in selecting the right network in terms of cost.
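To make the plan types above concrete, the sketch below shows one possible way of representing them in code. The class and field names are illustrative assumptions made for this sketch, not part of PACO-eLF itself; the same hypothetical model is reused in the later listings.

# Hypothetical data model for the billing plan types described above.
from dataclasses import dataclass
from enum import Enum, auto

class PlanType(Enum):
    FREE = auto()          # free Internet access
    FLAT_RATE = auto()     # unlimited access, paid monthly
    TIME_BASED = auto()    # e.g. 0.005 EUR/minute
    DATA_BASED = auto()    # e.g. 0.2 EUR/MB
    DATA_BUNDLE = auto()   # prepaid quantity of data for a period

@dataclass
class BillingPlan:
    operator: str
    network: str                # e.g. "WiFi", "3G", "GPRS"
    plan_type: PlanType
    unit_price: float = 0.0     # EUR/minute or EUR/MB, depending on type
    bundle_price: float = 0.0   # price of acquiring a data bundle
    bundle_size_mb: float = 0.0 # total data in the bundle
    remaining_mb: float = 0.0   # data left in an active bundle
    in_use: bool = False        # True if a bundle is active in this period
    days_to_expiry: int = 0     # for bundles already in use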


3. Cost oriented adaptive e-learning system

PACO-eLF (Performance Aware and Cost Oriented e-Learning Framework) (Figure 2) aims at offering adaptive educational content to learners by taking into account their profile, the device they are using, the network performance and the cost they have to pay for accessing the content. The classical architecture of an adaptive e-learning system, which consists of UM (User Model), DM (Domain Model) and AM (Adaptation Model), has been extended by adding the PM (Performance Model) and CM (Cost Model).

Figure 2. PACO-eLF

PACO-eLF consists of a Client Application and a Server Application. The Client Application maintains the CM. It stores the billing plans for every network the learner device has access to. It also estimates the price for each available network when a document is required. It interacts with the Server Application to get information on changes in the characteristics of the network currently in use that may affect the price to be paid for the data retrieval. If the total price increases over a certain threshold imposed by the learner, the learner is prompted and provided with an alternative network that would offer a better price for retrieving the educational content.

The Server Application maintains information about the learner profile (UM), the available courses (DM), the network performance (PM) and the adaptation rules that describe how to personalise the educational content based on the user profile and the network conditions (AM). The rules are interpreted by the Adaptation Engine (AE).

The User Model (UM) holds the learner profile. It consists of:
• demographic information (e.g. address)
• personal data (e.g. name, password, etc.)
• learner preferences
• learner goals
• knowledge about the concepts contained in the DM
• device characteristics
• how much the learner is willing to pay in order to retrieve educational content, etc.
The device characteristics considered by the UM are:
• Screen size: width and height in pixels;
• Screen colour depth: bits/pixel;
• Screen mode: whether the screen has portrait or landscape mode and whether it supports switching between the two modes;
• Capabilities: whether the device is capable of displaying video, audio, images, etc.;
• Supported mark-up or scripting language: e.g. not all mobile devices support all JavaScript functions;
• Memory: capacity.

The Domain Model (DM) stores and organises the educational content, divided into fragments between which relationships exist. For example, a link indicates that navigation can be done between two fragments, and a prerequisite relationship indicates that there is an order in which the fragments should be delivered to the learner (e.g. a learner should not read about a certain concept if s/he has no knowledge of, or did not first read about, the prerequisite concept). The educational content fragments can be grouped together based on these relationships in order to form complex concepts.

The Performance Model (PM) contains information about the performance of the different networks that the learner has access to. For every network enabled on the device, performance characteristics are maintained and continuously monitored, in order to determine the quality of the transmitted content. It also provides suggestions on how the educational content should be adapted so that it is suitable for transmission over the active network.

The Adaptation Model (AM) holds the adaptation rules based on which the content selection and personalisation is done. The rules combine information on the learner profile, the device, the network conditions and the cost.

The Adaptation Engine (AE) interprets the rules from the AM and selects the most suitable educational content.

The Cost Model (CM) maintains the learner's billing plans. It also has the role of suggesting to the learner the best network to be used in terms of cost and performance, in order to ensure that the threshold imposed by the learner is not surpassed. The best network to be used is determined by an algorithm that is presented in the next section.

4. Cost oriented network selection algorithm

The main goal of the algorithm is to determine the best network, from a list of available ones on the device (in terms of price), for delivering the personalised educational content.

We consider that the learner has one or more mobile network operators and s/he may have one or more billing plans currently in use. The plan types the learner may have are: free Internet access, flat rate billing, data bundle billing, data based billing and/or time based billing.

For the data bundle billing, the quantity of information contained in the bundle is usually available for a specific period of time. Sometimes the mobile data operator does not allow a new bundle to be acquired if the learner has a bundle in use for the current time period. For example, if the learner chooses a data bundle over a period of 30 days which contains 500 MB of data, s/he may not choose another data bundle billing plan if the 30 days period has not expired. This leads to the situation in which the other plans are unavailable for the user. Therefore, they should not be considered by the algorithm when the learner is provided with the cheapest alternative to access the educational content. An algorithm for selecting just the available plans for every operator/network the learner has access to is presented in Figure 3; it takes as input all the available operators and returns the billing plans the learner currently has access to.
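Since the original flowchart (Figure 3) cannot be reproduced here, the fragment below sketches the selection logic it describes, reusing the hypothetical BillingPlan model from the earlier listing. The operator policy is simplified to "one active bundle per operator blocks buying another bundle"; this is an assumption for illustration.

# Sketch of the Figure 3 step: keep only the billing plans the learner
# can actually use right now. Assumes the BillingPlan/PlanType classes
# from the earlier listing.
from typing import List

def available_plans(all_plans: List[BillingPlan]) -> List[BillingPlan]:
    operators_with_active_bundle = {
        p.operator for p in all_plans
        if p.plan_type is PlanType.DATA_BUNDLE and p.in_use
    }
    selected = []
    for plan in all_plans:
        if (plan.plan_type is PlanType.DATA_BUNDLE
                and not plan.in_use
                and plan.operator in operators_with_active_bundle):
            # A new bundle cannot be acquired while one is already active.
            continue
        selected.append(plan)
    return selected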


Based on the selected plans, on the lecture size and on the network connection speed, the price for accessing the educational content is computed (Figure 4a-d). The lecture size is provided by the AE (Adaptation Engine) after selecting the educational material suited to the learner profile. The connection speed for each of the available networks is provided by the PM (Performance Model).

Figure 3. Operators plans selection

Figure 4a. Cheapest algorithm selection


Figure 4b. Estimated price for time based plan
Figure 4c. Estimated price for volume based plan
Figure 4d. Price estimate for data based bundle plan

The estimated price the learner has to pay may be null in three cases:
• the learner has access to a free network;
• the learner has a flat rate plan;
• the learner has a data bundle plan already in use and the remaining size of the bundle is greater than the total size of the requested document.
Otherwise:
• if the learner has a data bundle plan already in use and the size of the lecture exceeds the remaining quantity of data in the bundle, the price for the quantity of information which exceeds the limit is calculated, and that is considered the estimated price the learner has to pay (Figure 4d);
• if a data bundle plan is available but not in use, the price will also include the price of the bundle (Figure 4d);
• if the learner has a time based plan (Figure 4b), the estimated price will be computed based on the price per unit of time and the average network speed;
• if the learner has a data volume based plan, the estimated price is computed based on the lecture size and the price per quantity of information (Figure 4c).
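The following fragment sketches these pricing rules (Figures 4a-d) under the same hypothetical data model as the earlier listings. The units are assumptions made for illustration: sizes in MB and connection speed in MB/minute.

# Sketch of the price estimation cases above. Assumes the BillingPlan/
# PlanType classes introduced earlier.
def estimated_price(plan: BillingPlan, lecture_mb: float,
                    speed_mb_per_min: float) -> float:
    """Estimated cost of retrieving `lecture_mb` of content over `plan`."""
    if plan.plan_type in (PlanType.FREE, PlanType.FLAT_RATE):
        return 0.0
    if plan.plan_type is PlanType.TIME_BASED:
        # price per minute multiplied by the estimated download time
        return plan.unit_price * (lecture_mb / speed_mb_per_min)
    if plan.plan_type is PlanType.DATA_BASED:
        return plan.unit_price * lecture_mb
    if plan.plan_type is PlanType.DATA_BUNDLE:
        if plan.in_use:
            # only the excess over the remaining bundle data is charged
            excess = max(0.0, lecture_mb - plan.remaining_mb)
            return excess * plan.unit_price
        # a new bundle must be bought first, plus any excess over its size
        excess = max(0.0, lecture_mb - plan.bundle_size_mb)
        return plan.bundle_price + excess * plan.unit_price
    raise ValueError("unknown plan type")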


Having all these prices computed, a ranking can be made based on the amount of money the learner has to pay. Two other criteria are taken into account when the data bundle based plans are classified: the expiry date of the data bundle and the quantity of information contained in a bundle. The first criterion is useful when there are two bundles already in use that have the same price. Most learners would probably choose to use the bundle that is going to expire first. For example, if there are two bundles in use, the first having a remaining data allowance of 500 MB and expiring the next day and the second having 1 GB and expiring in a week, and the lecture size is less than 500 MB, the first network will be displayed to the learner as the first option. The second criterion applies when the bundles are not in use yet, but two mobile operators offer, at the same price, data bundles with different limits on the quantity of information to be transferred. In this case the plan which has the larger quantity of information in the bundle may be chosen. For example, if one operator offers a 1 GB data bundle for 15 Euros whereas a second one offers 5 GB for the same price, the most advantageous plan for the learner would be the one offered by the second operator.

After ranking the plans, the top three plans in terms of cost are displayed to the learner and s/he will choose among them. However, the learner has the option to see the other plans, if s/he wishes to do so.

5. Conclusions and further work

This paper presented and discussed various billing plans that currently exist on the market for accessing the Internet. PACO-eLF, an adaptive e-learning framework, was briefly presented and a cost oriented network selection algorithm was described in detail. The algorithm assesses the billing plans of the active networks on the learner device and computes the price of downloading a given document for each network. It provides the learner with information related to how much s/he needs to pay when the educational content is retrieved.

We are currently working on an improved version of the algorithm that provides a better estimation of the price the learner has to pay. We achieve this by also taking into account other messages/information sent over the network that are not included in the lecture size. The algorithm will also take into account network conditions, in order to provide the learner with the best network alternative over which the educational content can be sent. The PACO-eLF framework is currently under implementation. Tests will be performed to see the effects of the algorithm on the learner QoE (Quality of Experience). The results will be presented in another paper.

ACKNOWLEDGEMENT
This research work is supported by the IRCSET Embark Postgraduate Scholarship Scheme and the SFI Research Frontiers Programme.

References
[1] C. Shuler, "Pockets of Potential: Using Mobile Technologies to Promote Children's Learning", The Joan Ganz Cooney Center at Sesame Workshop, New York, USA, 2009. Retrieved April 10, 2009, from http://joanganzcooneycenter.org/pdf/pockets_of_potential.pdf
[2] D. Becking, S. Betermieux, B. Bomsdorf, F. Birgit, E. Heuel, P. Langer and G. Schlageter, "Didactic Profiling: Supporting the Mobile Learner", in G. Richards (Ed.), Proceedings of World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education, 2004, pp. 1760-1767.
[3] M. Yudelson, O. Medvedeva and R. Crowley, "A multifactor approach to student model evaluation", User Modeling and User-Adapted Interaction, 2008, 18(4), pp. 349-382.
[4] P. Karampiperis and D. Sampson, "Adaptive learning resources sequencing in educational hypermedia systems", Educational Technology & Society, 2005, 8(4), pp. 128-147.
[5] E. Brown, T. Brailsford, T. Fisher, A. Moore and H. Ashman, "Reappraising cognitive styles in adaptive web applications", Proceedings of the 15th International Conference on World Wide Web, ACM Press, New York, USA, 2006, pp. 327-335.
[6] P. De Bra, D. Smits and N. Stash, "Creating and Delivering Adaptive Courses with AHA!", Proceedings of the First European Conference on Technology Enhanced Learning, Springer LNCS 4227, Crete, October 2006, pp. 21-33.
[7] C. H. Muntean, "Improving learner quality of experience by content adaptation based on network conditions", Computers in Human Behavior, 2008, 24(4), pp. 1452-1472.
[8] A. Brady, O. Conlan and V. Wade, "Dynamic Composition and Personalisation of PDA-based eLearning - Personalized mLearning", E-Learn'04, World Conference on E-Learning in Corporate, Government, Healthcare and Higher Education, Washington, D.C., 2004, pp. 234-242.
[9] M. Á. C. González, M. J. C. Guerrero, M. A. Forment and F. J. G. Peñalvo, "Back and Forth: From the LMS to the Mobile Device. A SOA Approach", Proceedings of the Mobile Learning Conference 2009, IADIS Press, 2009, pp. 114-120.


[10] D. Keegan, "Mobile Learning: The Next Generation of Learning", Distance Education International, 2005. Retrieved April 30, 2009, from http://learning.ericsson.net/mlearning2/files/workpackage5/book.doc
[11] Telecoms Pricing, "Mobile Broadband Pricing Survey 2009", 2008. Retrieved April 10, 2009, from http://www.telecomspricing.com/product.cfm?ds=telecomspricing_content&prod=311&dept=304
[12] V. Roto, R. Geisler, A. Kaikkonen, A. Popescu and E. Vartiainen, "Data Traffic Costs and Mobile Browsing User Experience", MobEA IV Workshop on Empowering the Mobile Web, in conjunction with the WWW2006 conference, 2006. Retrieved April 10, 2009, from http://www.research.att.com/~rjana/MobEA-IV/PAPERS/MobEA_IV-Paper_7.pdf
[13] P. Isomursu, R. Hinman, M. Isomursu and M. Spasojevic, "Metaphors for the Mobile Internet", Journal on Knowledge, Technology & Policy, 2007, 20(4), pp. 259-26.


Section 5A
RADIO SYSTEMS 2


A Blind Detection Method of Non-Periodic DSSS Signals at Lower SNR

Junjie Pu
School of Telecommunication
Hangzhou Dianzi University
Hangzhou, 310018, China
gigibest@gmail.com

Zhijin Zhao
School of Telecommunication
Hangzhou Dianzi University
Hangzhou, 310018, China
zhaozj03@hdu.edu.cn

Abstract—Because the detection of non-periodic direct sequence spread spectrum (DSSS) signals cannot make direct use of periodicity and correlation, and because such signals are always buried in the background noise, their blind detection and parameter estimation is particularly difficult. Since the fourth-order moment chip of non-periodic DSSS signals contains the carrier frequency information and has good performance in suppressing Gaussian white noise, a detection method based on the quadratic fourth-order moment chip of non-periodic DSSS signals is proposed. Experimental results show that the detection performance of the proposed method is better than that of the double frequency method, the spectrum-reprocessing method and the cepstrum method. When the false alarm probability is 1%, the detection probability of the method reaches above 90% at an SNR of -20 dB.

Keywords—Non-periodic DSSS signal; fourth-order moment chip; signal detection; lower SNR

I. INTRODUCTION

Low probability of interception (LPI) signal detection and parameter estimation has been a hot and difficult research topic at home and abroad. The DSSS signal has been widely used in military communications, satellite communications, satellite navigation systems and many other systems because of its secrecy, concealment and anti-interference ability. DSSS signals can be divided into periodic signals and non-periodic signals. A non-periodic DSSS signal, also known as a long-code signal, is one in which a spread-spectrum sequence cycle includes a number of information symbol cycles; that is, different information symbols correspond to different portions of the spread-spectrum PN code. Typical applications include the JTIDS signal, the GPS P(Y) code signal and so on. WCDMA signals are non-periodic signals.

According to much of the current literature, the key to DSSS signal blind detection is generally how to use the characteristics of the PN code and the signal correlation, and also how to use accumulation to suppress the noise in order to achieve detection and parameter estimation. Generally, the DSSS signal is assumed to be a periodic signal, and some good test results at low SNR are obtained. However, when the received signal is a non-periodic DSSS signal, blind detection and parameter estimation become very difficult at low SNR. Because a non-periodic DSSS signal destroys the cyclic characteristics of the PN code and its correlation, the signal detection performance degrades. A lower SNR means that the signal is very weak and the signal spectrum is entirely submerged in the Gaussian white noise spectrum, which introduces larger errors into the detection and data processing. The energy method [1], double frequency method [2], conventional multiplication delay detection method [3], cumulative method [4], cepstrum method [5] and spectral correlation methods [6-8] can successfully detect non-periodic DSSS signals only at SNRs above about -8 dB. Therefore, non-periodic DSSS signal detection and estimation at a much lower SNR is a problem to be studied in depth.

Because high-order statistics contain rich information on the characteristics of a signal, and building on the work in [9, 10], a blind detection method for non-periodic DSSS signals using the quadratic fourth-order moment chip at lower SNR is proposed in this paper.
The detection probability of this method is above 90% at an SNR of -20 dB, and the method does not involve extensive matrix operations.

II. SIGNAL MODEL AND THEORETICAL ANALYSIS

The non-periodic DSSS signal (WCDMA signal) is defined as:

$$s(t) = \sum_{k=1}^{K} A_k \{ [d_{ki}(t-t_k) G_k c_k(t-t_k) S_{ki}(t-t_k) - d_{kq}(t-t_k) G_k c_k(t-t_k) S_{kq}(t-t_k)] \cos(2\pi f_c t + \varphi_k)$$
$$\qquad\qquad + [d_{ki}(t-t_k) G_k c_k(t-t_k) S_{kq}(t-t_k) + d_{kq}(t-t_k) G_k c_k(t-t_k) S_{ki}(t-t_k)] \sin(2\pi f_c t + \varphi_k) \} \quad (1)$$

where $A_k$ is the kth user's signal amplitude; $d_{ki}(t-t_k)$ and $d_{kq}(t-t_k)$ are the kth user's odd-bit and even-bit data; $c_k(t-t_k) \in \{1, -1\}$ is the OVSF code, and the channelisation codes in the same state differ from each other; $S_{ki}(t-t_k)$ and $S_{kq}(t-t_k)$ are the real and imaginary parts of the complex scrambling code. They are independent, and the scrambling codes of different areas are also independent. $f_c$ is the carrier frequency; $\varphi_k$ is a random phase distributed evenly in $(0, 2\pi)$; $t_k$ is the kth user's random delay


and is uniformly distributed in the range $[0, T]$; $T$ is the width of the symbol.

The two hypotheses are:

$$H_1: x(t) = s(t) + n(t) \quad (2)$$
$$H_0: x(t) = n(t) \quad (3)$$

where $s(t)$ is the non-periodic DSSS signal and $n(t)$ is Gaussian white noise with zero mean; they are independent.

Under hypothesis $H_0$, the fourth-order moment of $x(t)$ is:

$$m_{4x}(\tau_1, \tau_2, \tau_3) = \sigma_n^4 [\delta(\tau_1)\delta(\tau_3 - \tau_2) + \delta(\tau_2)\delta(\tau_3 - \tau_1) + \delta(\tau_3)\delta(\tau_2 - \tau_1)] \quad (4)$$

One slice of this moment, the fourth-order moment chip, is:

$$m_{4x}(0, 0, \tau) = 0, \quad \tau \neq 0 \quad (5)$$

Under hypothesis $H_1$, the fourth-order moment chip of $x(t)$ is:

$$m_{4x}(0,0,\tau) = \Big\{ \sum_{k=1}^{K} \tfrac{1}{2} A_k^4 \big[ m_{4d}(0,0,2\tau) + m_{4d}(0,1,2\tau+1) + m_{4d}(0,-1,2\tau-1) + \tfrac{1}{2} m_{4d}(1,1,2\tau) + \tfrac{1}{2} m_{4d}(-1,-1,2\tau) \big]$$
$$\qquad + \sum_{k=1}^{K} \sum_{m \neq k} A_k^2 A_m^2 R_d(0) R_d(2\tau) + 2\sigma_n^2 \sum_{k=1}^{K} A_k^2 R_d(2\tau) \Big\} R_c(\tau) R_S(\tau) \cos(2\pi f_c \tau)$$
$$\quad + \Big\{ \sum_{k=1}^{K} \tfrac{1}{4} A_k^4 \big[ 2 m_{4d}(0,0,2\tau+1) - 2 m_{4d}(0,0,2\tau-1) + m_{4d}(1,1,2\tau+1) - m_{4d}(-1,-1,2\tau-1) \big]$$
$$\qquad + \sum_{k=1}^{K} \sum_{m \neq k} 6 A_k^2 A_m^2 R_d(0) \big[ R_d(2\tau+1) - R_d(2\tau-1) \big] + \sigma_n^2 \sum_{k=1}^{K} A_k^2 \big[ R_d(2\tau+1) - R_d(2\tau-1) \big] \Big\} R_c(\tau) R_S(\tau) \sin(2\pi f_c \tau) \quad (6)$$

In Eq. (6), $\tau \neq 0$, $R_c(\tau) = E[c(t)c(t+\tau)]$, $R_d(\tau) = E[d(t)d(t+\tau)]$, $R_S(\tau) = E[S(t)S(t+\tau)]$, and $m_{4d}(\tau_1, \tau_2, \tau_3)$ is the fourth-order moment of $d(t)$.

In Eq. (6) the carrier frequency component is contained in the fourth-order moment chip, so the existence of a non-periodic DSSS signal can be detected through the analysis of the received signal in the frequency domain. Figure 1 shows the spectrum of $m_{4x}(0,0,\tau)$ when the carrier frequency is 10 MHz and the SNR is -5 dB. From Figure 1, $m_{4x}(0,0,\tau)$ has a clear peak at the carrier frequency. Hence, the fourth-order moment chip can be used as a detection statistic for the non-periodic DSSS signal.

Figure 1. The spectrum of $m_{4x}(0,0,\tau)$ at SNR = -5 dB (magnitude vs. frequency, with a peak at the 10 MHz carrier).

III. DETECTION METHOD

A. Detecting rules

According to Eqs. (5) and (6), under hypothesis $H_1$ the $m_{4x}(0,0,\tau)$ of the signal-plus-noise mixture is not equal to zero, while under hypothesis $H_0$ the $m_{4x}(0,0,\tau)$ of the Gaussian white noise is equal to zero. This means that $m_{4x}(0,0,\tau)$ can suppress Gaussian white noise well except at the point $\tau = 0$, where information about the noise remains. So $m_{4x}(0,0,\tau)$ at the point $\tau = 0$ is removed in order to reduce the impact of the noise. In practice, since the estimation error of higher-order statistics increases with the order, higher-order statistics cannot suppress Gaussian white noise as well as expected. In order to solve this problem, a new detection method for non-periodic DSSS signals based on the quadratic fourth-order moment chip (QFMC) is proposed, which is described as follows:

(1) The fourth-order moment chip of the non-periodic DSSS signal is estimated as $\hat{m}_{4x}(0,0,\tau)$.
(2) $\hat{m}_{4x}(0,0,\tau)$ at the point $\tau = 0$ is removed; the remainder is denoted $y(t)$.
(3) The fourth-order moment chip of $y(t)$ is computed and transformed into the frequency domain as $FFT(\hat{m}_{4y}(0,0,\tau))$.
(4) If $FFT(\hat{m}_{4y}(0,0,\tau)) > T_D$, the non-periodic DSSS signal is present; otherwise, the signal is absent.

The detector structure of this method is shown in Figure 2.
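A numpy sketch of steps (1)-(4) is given below. It estimates the chip $m_{4x}(0,0,\tau) = E[x^3(t)x(t+\tau)]$ as the cross-correlation of $x^3$ and $x$ computed via FFTs, following the shortcut of Eq. (7) in the next subsection with the limit and expectation omitted. The threshold $T_D$ is assumed to have been calibrated beforehand from noise-only data for a 1% false alarm rate; the choice of spectral bins to scan is also an assumption of this sketch.

import numpy as np

def fourth_order_moment_chip(x):
    """Estimate m4x(0,0,tau) = E[x^3(t) x(t+tau)] as the circular
    cross-correlation of x^3 and x, computed via FFTs (Eq. (7) shortcut,
    limit and expectation omitted)."""
    n = len(x)
    spec = np.fft.fft(x) * np.conj(np.fft.fft(x ** 3))
    return np.real(np.fft.ifft(spec)) / n

def qfmc_detect(x, threshold):
    """Quadratic fourth-order moment chip detector, steps (1)-(4)."""
    m4x = fourth_order_moment_chip(np.asarray(x, dtype=float))  # step (1)
    y = m4x[1:]                          # step (2): remove the tau = 0 point
    m4y = fourth_order_moment_chip(y)    # step (3): second (quadratic) pass
    spectrum = np.abs(np.fft.fft(m4y))
    # step (4): compare the spectral peak (DC bin excluded) with T_D
    return spectrum[1:len(spectrum) // 2].max() > threshold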


Figure 2. The blind detector structure for non-periodic DSSS signals based on the quadratic fourth-order moment chip: $x(t) = s(t) + n(t) \rightarrow \hat{m}_{4x}(0,0,\tau) \rightarrow$ remove $\tau = 0 \rightarrow FFT(\hat{m}_{4y}(0,0,\tau)) \rightarrow$ decide $H_1$ if $> T_D$, otherwise $H_0$.

Because the load of the direct computation of $m_{4x}(0,0,\tau)$ is too heavy, Eq. (7) is used to reduce the amount of calculation:

$$m_{4x}(0,0,\tau) = IFT\Big\{ \lim_{N \to \infty} \frac{1}{2N+1} \, E\Big[ \Big( \sum_{t=-N}^{N} x^3(t)\, e^{-j2\pi ft} \Big) \Big( \sum_{t=-N}^{N} x(t)\, e^{-j2\pi ft} \Big)^{*} \Big] \Big\} \quad (7)$$

In the actual calculations, we approximate the estimation of $m_{4x}(0,0,\tau)$ by omitting the computation of the limit and the expectation in Eq. (7).

B. Simulation and Analysis

In the simulation, the chip rate is 2 Mbps; the information rate is 10 kbps; the carrier frequency is 10 MHz; the A/D sampling frequency is 200 MHz, so the number of samples N in one carrier period is 20. The spreading factor of the Walsh code is 64, and the length of the scrambling code is 2047. The additive noise is Gaussian white noise. A scrambling code contains a maximum of 32 information symbols. Setting the false alarm probability to 1%, the detection threshold $T_D$ can be determined.

When the SNR is -25 dB, the spectrum of $m_{4x}(0,0,\tau)$ is shown in Figure 3. As the noise increases, the spectrum peak of the signal is no longer obvious. In theory, the fourth-order moment chip of the mixed signal can effectively suppress Gaussian white noise, but since there is estimation error and the noise in the simulation is not strictly Gaussian white noise, the noise cannot be completely suppressed. When the SNR is -25 dB, the spectrum of $m_{4y}(0,0,\tau)$ is shown in Figure 4. In this figure, the noise has been suppressed and the spectrum peak of the non-periodic DSSS signal is obvious. So the method proposed in this paper not only effectively reduces both the Gaussian white noise and the estimation error, but also improves the detection performance.

If the number of users is 2, 4 and 8, respectively, the detection performance of the proposed QFMC method is shown in Figure 5. We can see that the probability of detection increases with the number of users K. According to Eq. (6), if the number of users increases, the value of the fourth-order moment chip increases. Thus, the simulation results are consistent with the theoretical analysis.

Figure 3. The spectrum of the fourth-order moment chip of the noisy signal at SNR = -25 dB.

Figure 4. The spectrum of the quadratic fourth-order moment chip of the noisy signal at SNR = -25 dB.

Figure 5. The detection of non-periodic DSSS signals with different numbers of users (detection probability vs. SNR for K = 2, 4, 8).


When the number of users is 2, the detection performances of the double frequency method (DFM), the spectrum-reprocessing method (SRM), the cepstrum method (CM) and the proposed QFMC method are shown in Figure 6. From Figure 6 it can be seen that the detection probability of QFMC is above 90% at an SNR of -20 dB. The detection performance of QFMC is much better than that of the other methods. The SNR improvement of the QFMC method is above 9 dB when the detection probability is above 90%.

Figure 6. The blind detection of non-periodic DSSS signals with different methods (detection probability vs. SNR for QFMC, DFM, SRM and CM).

IV. CONCLUSION

Since the fourth-order moment chip of non-periodic DSSS signals has good performance in suppressing Gaussian white noise, a detection method based on the quadratic fourth-order moment chip of non-periodic DSSS signals has been proposed. The simulation results show that this method's detection probability can reach above 90% at an SNR of -20 dB.

REFERENCES
[1] H. Urkowitz, "Energy Detection of Unknown Deterministic Signals", Proceedings of the IEEE, 1967, 55(4), pp. 523-531.
[2] D.A. Hill and J.B. Bodie, "Carrier Detection of PSK Signals", IEEE Trans. on Commun., 2001, 49(3), pp. 487-496.
[3] Yuan Liang, Liu Jinan and Wen Zhijin, "A Detection Method for DS/SS Signals in the Low SNR Condition", Modern Electronics Technique, 2005, 196(5), pp. 50-51.
[4] D.E. Reed and M.A. Wickert, "Spread spectrum signals with low probability of chip rate detection", IEEE Journal on Selected Areas in Communications, 1989, 7(4), pp. 595-601.
[5] W.A. Gardner and C.M. Spooner, "Signal Interception: Performance Advantages of Cyclic-Feature Detectors", IEEE Trans. on Commun., 1992, 40(1), pp. 149-159.
[6] W.A. Gardner and C.M. Spooner, "Detection and Source Location of Weak Cyclostationary Signals: Simplifications of the Maximum Likelihood Receiver", IEEE Trans. on Commun., 1993, 41(6), pp. 905-916.
[7] Zhang Tianqi and Zhou Zhengzhong, "A new spectral method for periodic detection of the PC sequence in lower SNR DS/SS signals", Chinese Journal of Radio Science, 2001, 16(4), pp. 518-521.
[8] Zhang Jianli, "DS Signal Detection", Radio Engineering of China, 1994, 24(2), pp. 19-21.
[9] Zhao Zhijin, Wu Jia and Xu Chunyun, "The Study on the Detection Methods of DSSS/QPSK Signals Based on the Fourth-order Cumulants", Acta Electronica Sinica, 2007, 35(6), pp. 1046-1049.
[10] Wu Jia, Zhao Zhijin, Shang Junna and Kong Xianzheng, "Detection and Multiple Parameter Estimation Methods for Direct Sequence Spread Spectrum Signals at Lower SNR", Computer Simulation, 2008, 25(2), pp. 153-156.


Power Consumption Analysis of Bluetooth in Sniff Mode

Jiangchuan Wen
Electronic and Computer Engineering Department
University of Limerick
Limerick, Ireland
Jiangchuan.Wen@ul.ie

John Nelson
Electronic and Computer Engineering Department
University of Limerick
Limerick, Ireland
John.Nelson@ul.ie

Abstract—Bluetooth (BT) is a short-range wireless communications system whose key features are robustness, low power, and low cost. In this paper, we focus on BT power consumption in sniff mode. First of all, we discuss the BT operations and derive an expression for the BT average power in sniff mode. Secondly, we use the current unit (milliampere) to scale the BT average power and calculate it using the derived expression. Thirdly, the analysis shows a significant saving when a BT chip is in the slave role using sniff mode, even when using short sniff intervals: for instance, with a sniff interval T_sniff of 40 ms, the slave role can save 96.6% power in comparison to the master's power consumed when N_poll = 2 slots on the ACL links. Given that the slave's clock needs re-synchronization, and considering the worst-case clock drift and the response time, the longest supported T_sniff is not recommended. The impacts of the N_sniff_attempt and T_sniff parameters and the recently introduced sniff sub-rating are also considered.

Keywords—Bluetooth; sniff mode; power saving; power consumption; sniff sub-rating

I. INTRODUCTION

Bluetooth [1] is a popular short-range wireless communications system operating in the unlicensed 2.4 GHz ISM (Industrial Scientific Medical) band; its key features are robustness, low power, and low cost. Currently, Bluetooth is being applied to more and more applications in diverse fields, for example PC accessories, mobile phone accessories, home and entertainment devices, medical and fitness sensors, the automotive industry, and factory and industrial automation sensors.

As Bluetooth applications spread, more and more portable devices include a Bluetooth module. These devices have a large battery that can supply them for up to a week or more. As the Bluetooth module is the main data transmission module between devices, its energy consumption has gradually emerged as a problem. This paper presents a study of BT power consumption in the standard-specified sniff mode.

The paper is structured as follows. Section II gives a short summary of Bluetooth low power modes. Section III explains the sniff mode operation in detail. In Section IV, we derive the expression for the BT average power in sniff mode. In Section V we evaluate the BT power and analyze the results. Finally, Section VI presents our conclusion.

II. BLUETOOTH LOW POWER OPERATIONS OVERVIEW

Bluetooth provides various low-power operations to manage power consumption. At the microscopic level, packet handling and slot occupancy must be minimized, but in accordance with the core specification. The basic idea is to reduce the information exchange between the Bluetooth devices, and to allow the transmitter and receiver to return to sleep when possible.

At the macroscopic level, the basic idea is to adopt a low power operation that reduces the duty cycle of the Bluetooth devices. There are three low power operation modes: sniff, hold and park, all of which are optional. In sniff mode, the master and slave agree on periodic anchor points at which they will communicate.
Consequently, the Bluetooth devices negotiate a sniff interval (T_sniff) and a sniff offset (D_sniff) in the Asynchronous Connectionless Link (ACL) logical transport, which is used largely to carry data as opposed to voice. The master shall only start a transmission to the slave in the specified sniff attempt slots, and the slave may return to sleep in the remaining slots of the T_sniff period. Sniff sub-rating (SSR) allows both the master and slave to increase the time between sniff anchor points, which can further reduce the power consumed by link management in sniff mode.

In hold mode, the slave shall not support ACL packets on the piconet's channel. A timer shall be initialized with the timeout value holdTO. When the timer expires, the slave shall wake up, synchronize to the traffic on the channel and wait for further master transmissions. During sniff and hold mode, the slave device keeps its logical transport address (LT_ADDR).

A slave in park mode does not need to participate on the piconet's channel, but still needs to remain synchronized to the channel. The slave shall give up its logical transport address LT_ADDR and shall receive two new addresses to be used in the park state.

Most research work focuses on the sniff mode, and it is widely recommended for low-power operation. In [2] [3], the authors proposed sniff scheduling schemes for power saving. In this paper, we will discuss the Bluetooth power consumption in sniff mode for different sniff intervals.

This publication has been supported by the Irish Research Council for Science, Engineering and Technology (IRCSET) and the Wireless Access Research Centre, University of Limerick, Ireland.


III. BLUETOOTH OPERATIONS IN SNIFF MODE

Sniff mode is the most common and flexible method for reducing Bluetooth's power consumption. The operations are as follows.

A. LMP and HCI Commands Operations

The Bluetooth core system consists of a Host and one or more Controllers [1]. A Host is defined as all of the layers below the profiles and above the Host Controller Interface (HCI). A Controller is defined as all of the layers below the HCI.

The Link Manager (LM) controls how the Bluetooth piconets and scatternets are established and maintained by means of the Link Control commands. The Host Controller Interface (HCI) provides a uniform command method of accessing controller capabilities.

To enter sniff mode, both the master and slave can start a negotiation through Link Manager Protocol (LMP) messages, commonly referred to as LMP protocol data units (PDUs). The process is initiated by sending an LMP_sniff_req PDU containing a set of parameters, which includes the timing control flags, D_sniff, T_sniff, N_sniff_attempt and N_sniff_timeout. The receiving LM shall then decide whether to 1) reject the attempt by sending an LMP_not_accepted PDU; 2) suggest different parameters by replying with an LMP_sniff_req PDU; or 3) accept the request. The negotiation is shown in Fig. 1.

Figure 1. Negotiation for sniff mode [1]

The HCI_Sniff_Mode command is used to alter the behavior of the LM and have it place the ACL baseband connection associated with the specified Connection Handle into sniff mode. The HCI_Exit_Sniff_Mode command is used to end sniff mode for a Connection Handle which is currently in sniff mode.

The HCI_Sniff_Mode command has five command parameters: Connection_Handle, Sniff_Max_Interval, Sniff_Min_Interval, Sniff_Attempt and Sniff_Timeout. Note that the HCI_Sniff_Mode parameters include a minimum and a maximum sniff interval, allowing the LM a degree of flexibility in selecting the T_sniff period.

An example of a sniff mode request which is accepted is shown in Fig. 2.

Figure 2. Sniff mode request by LMP and HCI commands [1]

B. Transmitter and Receiver Operations

When a slave enters sniff mode, it need not listen in every receive slot (Rx) and its receiver may go to sleep until the next anchor point. There are two key parameters in sniff mode: N_sniff_attempt and N_sniff_timeout. These parameters specify the number of baseband receive slots for the sniff attempt and the timeout, respectively. The slave continues to listen from the anchor point for the specified number of sniff attempts and, if it has received a packet addressed to it, it may continue to listen for more packets up until the specified timeout. The slave's receiver prepares to receive packets from the master in the scheduled sniff receive slots and performs operations based on the content of the received packets or on pending data for transmission.

Fig. 3 gives an overview of the operations of the slave transmitter and receiver.

Figure 3. The slave's operations in sniff mode

From Fig. 3, we observe that the slave has a recover timing (RT) period at the sniff anchor point. The reason is that the slave runs using its native clock (CLKN) in sniff mode and loses synchronization while sleeping. The master clock (CLK) is used for all timing and scheduling activities in the piconet. The slave maintains its own approximation of the master clock but, due to timing jitter and time drift between the respective clocks, it must continuously resynchronize. Hence, an uncertainty window is defined around the exact receive timing.
The slave shall not recover the master timing until it receives a packet including the piconet access code from the master in the sniff attempt slots. The slave's recover timing operation is shown in Fig. 4, where the recover timing window is centred around the slave's estimation of the sniff anchor point time.


Figure 4. The slave's recover timing operation at the sniff anchor point

IV. BT AVERAGE POWER EXPRESSION IN SNIFF MODE

The Bluetooth power consumption in sniff mode will be scaled by the average power over the sniff interval. We analyze the different states the Bluetooth device passes through in sniff mode and calculate the time spent in each state ($t_i$) [4]. The average power can therefore be expressed as

$$P_{avg} = \frac{\sum_i P_i \, t_i}{\sum_i t_i} \quad (1)$$

where $P_i$ represents the power consumption in state $i$.

A. Slave's Average Power

In sniff mode, the CLKN may be driven by a low power oscillator with worst-case timing accuracy (specified in the standard as drift = +/-250 ppm and jitter = 10 μs), so the slave's Rx recover timing period ($t_{RT}$) should be considered. The drift time parameter ($t_{drift}$) and the jitter time parameter ($t_{jitter}$) incur power consumption due to resynchronization. The RT power in sniff mode, $P_{RT}(S)$, averaged over a sniff interval, is:

$$P_{RT}(S) = P_{drift}(S) + P_{jitter}(S) = \frac{(2 \cdot drift \cdot T_{sniff} + 2 \cdot t_{jitter}) \cdot i_{Rx}}{T_{sniff}} \cdot V \quad (2)$$

where $S$ denotes the slave; $T_{sniff}$ is the sniff mode interval; $i_{Rx}$ is the current of a device in the slave role when receiving; and $V$ is the voltage of the specific Bluetooth chip.

To simplify the analysis, we set $N_{sniff\_timeout} = 0$. The slot length $t_{slot}$ is 625 μs. Although it is possible to go idle after the packet has been received, we will assume that the slave remains receiving for the full slot. Therefore, from the sniff description of Section III, the slave's Rx average power $P_{Rx}(S)$ in sniff mode is

$$P_{Rx}(S) = \frac{(N_{sniff\_attempt} \cdot t_{slot}) \cdot i_{Rx}}{T_{sniff}} \cdot V \quad (3)$$

If the master has no traffic for the slave during the sniff intervals, it is recommended that single-slot packets are transmitted by the master during the slave re-synchronization. The slave should therefore be sent a POLL or NULL packet, which includes the piconet access code, so that the slave keeps synchronized to the channel. The slave's transmitter will acknowledge in the corresponding transmission slot (Tx), e.g. using a NULL packet if no command or data needs to be sent. In the other Tx opportunities the slave typically won't send any packet if it has nothing to send. The slave's Tx average power $P_{Tx}(S)$ in sniff mode is

$$P_{Tx}(S) = \frac{i_{Tx} \cdot t_{slot} + ((N_{sniff\_attempt} - 1) \cdot t_{slot}) \cdot i_{Tx\_idle}}{T_{sniff}} \cdot V \quad (4)$$

After $N_{sniff\_attempt}$, the slave sleeps until the next sniff anchor point, which is the lowest power consumption state while the slave is in the connection state. The sleep time and average sleep power in sniff mode are

$$t_{sleep} = T_{sniff} - (2 \cdot drift \cdot T_{sniff} + 2 \cdot t_{jitter}) - N_{sniff\_attempt} \cdot 2 \cdot t_{slot}$$
$$P_{sleep}(S) = \frac{t_{sleep} \cdot i_{sleep}}{T_{sniff}} \cdot V \quad (5)$$

The average power in sniff mode for the basic scenario considered is the sum of the average powers of the individual operations:

$$P_{avg}(S) = [P_{RT}(S) + P_{Rx}(S)] + P_{Tx}(S) + P_{sleep}(S) \quad (6)$$

B. Master's Average Power

The master has three typical operations in sniff mode on the ACL link. The first, and the worst case for power consumption, is when the master sends a POLL packet to the slave, receives a NULL packet in return, and continues to do so throughout the sniff intervals. This means the effective sniff interval is $T_{sniff} = 1.25$ ms (2 slots) and the master's slots are always occupied ($N_{poll} = 2$ slots) on the ACL links; the average power is

$$P_{avg\_on}(M) = \frac{\frac{1}{2} i_{Tx} T_{sniff} + \frac{1}{2} i_{Rx} T_{sniff}}{T_{sniff}} \cdot V = \frac{1}{2}(i_{Tx} + i_{Rx}) \cdot V \quad (7)$$

where $M$ denotes the master.

The second is when the master works as in a normal ACL link with no data traffic to send in the sniff intervals, which means the master only sends POLL or NULL packets within the poll interval $N_{poll}$, which defaults to 40 slots. The average power is

$$P_{avg\_poll}(M) = \frac{(i_{Tx} + i_{Rx}) + i_{idle} \cdot (N_{poll} - 2)}{N_{poll}} \cdot V \quad (8)$$

The third typical operation is when the master is on the ACL link with data transfer only, e.g. a file transfer, which means the master slots are used for other logical transport traffic or for POLL and NULL packets to the slave, which might be in active mode or sniff mode. The approximate power consumption of this working state lies between $P_{avg\_on}(M)$ and $P_{avg\_poll}(M)$.


Fig. 5 illustrates an example of Bluetooth operation during sniff for both master and slave devices. Idle denotes that the receiver attempts to receive but, after the short receive window, realizes that there is no packet being transmitted and changes to the idle state.

Figure 5. An example of Bluetooth theoretical power in sniff mode

V. POWER CONSUMPTION CALCULATION AND ANALYSIS

A. Calculating the Average Power

The power consumption of a Bluetooth module is the sum of that of the BT chip and that of the other parts of the module, e.g. the microprocessor or wired communication devices. If the BT chip works in sniff mode, the other parts of the module, e.g. three-line UARTs, could be put to sleep to save power. In this paper we only consider the power saving of the BT chip in sniff mode.

Different Bluetooth chips have different Tx or Rx current and voltage parameters. A simplifying assumption is that the voltage parameter is a normalized constant (e.g. V = 1), so we use the current unit to scale the Bluetooth power. From [5] [6] [7] we obtain indications of the range of the current parameters of Bluetooth chips, from which we can estimate values that are representative of real devices. We set the parameters as follows: i_Tx = 22 mA, i_Rx = 18 mA, i_Tx_idle = i_idle = 4 mA, i_sleep = 40 μA, N_sniff_attempt = 1 and N_sniff_timeout = 0. Considering that the time drift parameter is variable and is not always at its maximum specified value, we set its average value drift_avg = 0.5 * drift_max = 125 ppm.

We can therefore make use of (2)-(8) to calculate the average power. The result is as follows:

TABLE I. BT CHIP'S AVERAGE POWER BASED ON CURRENT CONSUMPTION ON ACL LINKS WITH SNIFF MODE
Connection type | Operations mode (on ACL links) | Average power | P(S)/P(M)
master | N_poll = 2 slots | 20.0 mA | -
master | N_poll = 40 slots (default) | 4.80 mA | -
slave | T_sniff = 1.25 ms | 20.29 mA | 101%
slave | T_sniff = 40 ms | 0.677 mA | 3.4%
slave | T_sniff = 1.28 s | 0.0642 mA | 0.32%
slave | T_sniff = 40.9 s | 0.0454 mA | 0.23%

B. Power Consumption Analysis

From Table I, we observe that if T_sniff = 1.25 ms (2 slots), the BT chip of the slave will not save power in sniff mode, and the slave will consume more power than a master whose slots are always occupied (N_poll = 2 slots) on the ACL links. Fig. 6 compares the master's and slave's calculated power for different sniff intervals.

Figure 6. Average power of the BT chip in sniff mode, T_sniff = [1.25 ms, 40 ms]

If T_sniff = 40 ms, the chip in the slave role will save 96.6% of the power consumption, compared to the master's average power consumed using N_poll = 2 slots on the ACL links. The slave's calculated power in sniff mode with different sniff intervals is depicted in Fig. 7.

Figure 7. Average power of the BT chip in sniff mode, T_sniff = [40 ms, 1.28 s]

If T_sniff = 40.9 s and N_sniff_attempt = 1, the slave role will save even more power in theory. In reality, devices should not adopt the longest T_sniff to save power, even when the application allows it. There are a few things that must be considered besides power saving.

First of all, we must consider the CLKN time drift of the slave and the channel conditions.


In perfect synchronization, the slave device always receives the master-to-slave transmission at the sniff anchor point. However, the slave CLKN time drift could result in 2*drift*T_sniff ≈ 10-20 ms of drift in the case T_sniff = 40.9 s, whereby the slave might lose timing and require re-synchronization before it may send information. The re-synchronization of the slave consumes power dependent on the length of a single search window and the duration of the search.

At the same time, the slave may lose traffic from the master depending on the channel conditions. In the worst-case channel conditions it is possible that an LMP supervision timeout will be reached (max 40.9 s if enabled), causing the ACL connection to be lost. Thus, it is recommended that the parameter N_sniff_attempt * t_slot is larger than 2 * drift_max * T_sniff in sniff mode, to improve the probability that the slave receives a packet scheduled by the master during the sniff attempt windows. The slave will consume more power than the above calculation indicates for the longer sniff intervals.

Secondly, for most BT applications it is important that the device has a rapid response time and minimizes latency when communicating with the other device. The longer the sniff interval, the longer the waiting time when the device needs to transmit data. Some applications will switch between sniff mode and active mode with high frequency. Consequently, if the power saving of BT concerns only the BT chip in sniff mode, it is meaningless to improve the power saving from 96.6% to 99.9% by using the longest sniff interval.

Last but not least, sniff sub-rating provides a mechanism to increase the time between sniff anchor points. Even if SSR increases the response time in the normal case, the advantage of SSR is that when a packet is missed at an SSR anchor point the slave can listen and recover synchronization at the next sniff anchor points.

Finally, there are implications when selecting the value of the N_sniff_attempt parameter. From a power consumption perspective, Fig. 8 shows the impact of N_sniff_attempt and T_sniff.

Figure 8. Average power of the BT chip in sniff mode, T_sniff = [1.28 s, 40 s]

VI. CONCLUSION AND OUTLOOK

This paper introduced expressions for the average BT power consumption through an analysis of the BT operations in sniff mode. We evaluated the BT average power in sniff mode for different sniff intervals. The analysis shows that a BT chip in the slave role can save 96.6% of its power consumption while in sniff mode, based on realistic current parameters and setting T_sniff = 40 ms. The saving is in comparison to the master's power consumed when N_poll = 2 slots on the ACL links. The more the slave enters sniff mode with an appropriate T_sniff, the more power the slave can save.

In order to reduce the probability of link disconnections, appropriate values of T_sniff and N_sniff_attempt should be determined by the channel conditions. The longest T_sniff is not recommended in this paper; its value should be determined by considering the application's required response time and whether sniff sub-rating is available.

Most current BT-based devices do not achieve these savings, and the resultant extended battery life, as this requires that the other chips in a BT module are also put into deep sleep during the sniff sleep period, e.g. UART devices and microprocessors.
This requires cooperation between the host and the host controller, and some BT modules do support this.

REFERENCES
[1] Bluetooth SIG, "Specification of the Bluetooth System", Version 3.0+HS, 21 April 2009. https://www.bluetooth.org/Technical/Specifications/adopted.htm
[2] T.-Y. Lin and Y.-C. Tseng, "An adaptive sniff scheduling scheme for power saving in Bluetooth", IEEE Wireless Communications, vol. 9, no. 6, pp. 92-103, Dec. 2002.
[3] Li Xiang and Yang Xiaozong, "A sniff scheduling policy for power saving in Bluetooth piconet", Proc. 11th International Conference on Parallel and Distributed Systems, vol. 1, pp. 217-222, July 2005.
[4] A. Lewicki, J. Del Prado Pavón, J. Degraef, J. Talayssat and G. Jacquemod, "Power Consumption Analysis of a Bluetooth over Ultra Wide Band System", Proc. IEEE International Conference on Ultra-Wideband (ICUWB 2007), pp. 241-246, Sept. 2007.
[5] P. van Zeijl, J.-W. Th. Eikenbroek, P.-P. Vervoort, S. Setty, J. Tangenberg, G. Shipton, E. Kooistra, I. C. Keekstra, D. Belot, K. Visser, E. Bosma and S. C. Blaakmeer, "A Bluetooth radio in 0.18-μm CMOS", IEEE Journal of Solid-State Circuits, vol. 37, no. 12, pp. 1679-1687, Dec. 2002.
[6] BlueCore4-External Data Sheet (BC417143B-ds-001P). https://www.csrsupport.com/document.php?did=1539
[7] LMX9838 Bluetooth Serial Port Module. https://www.national.com/ds.cgi/LM/LMX9838.pdf


Necessity for an Intelligent Bandwidth Estimation Technique over Wireless Networks

Zhenhui Yuan, Hrishikesh Venkataraman and Gabriel-Miro Muntean
School of Electronic Engineering, Dublin City University, Dublin, Ireland
E-mail: {yuanzh, hrishikesh, munteang}@eeng.dcu.ie

Abstract

With the development of broadband systems, multimedia communication in wireless networks has become very common. However, supporting multimedia applications for multiple users requires either high bandwidth or a dynamic bandwidth utilization mechanism. This in turn calls for an efficient technique that can accurately estimate the available bandwidth in the network in real time. In this paper, different estimation algorithms are analyzed, and the performance of a state-of-the-art estimation algorithm, 'Spruce', is evaluated by comparison with the actual measured bandwidth. The key finding is that the bandwidth estimation technique always produces some error, which results in inaccurate estimation. This motivates the need for an intelligent estimation technique that would not only offset the inaccuracy resulting from the estimator's own use of the bandwidth, but also minimize the bandwidth consumed by the estimation algorithm.

I. Introduction

In recent years, there has been rapid growth of Internet-based services over wireless networks, and with this, more and more demands are being placed on network performance. End-users demand consistent monitoring of performance, both to detect faults quickly and to predict and provision for the growth of the network.

Measuring the performance of the Internet over a wireless network is extremely difficult. Even with the complete support of the different Internet service providers (ISPs), the complexity of the network means that multiple providers are normally involved in the end-to-end connection between hosts, which makes the monitoring of end-to-end performance by any one ISP nearly impossible. In addition, wireless networks, especially cellular networks providing both voice and video, require accurate bandwidth estimation to optimize network performance [1]. This creates a great demand for new tools that enable end-users and service providers to assess the performance of the wireless network, especially the network bandwidth, without any external assistance. There are significant constraints on the development of such tools. Importantly, they need to rapidly and easily measure the end-to-end performance of the network, while not placing any more load on the network than is absolutely necessary. Any extra load would restrict the times at which the measurement could be made and, depending on the topology of the wireless network, could create large extra traffic charges.

A large amount of time and energy is currently being spent on research into high-speed, next-generation networks [2]. These networks are being constructed to support the large growth of the Internet, as well as to bring high-bandwidth services to more people. There is an increasing demand in industry to find out whether the performance obtained from these networks matches what is expected of them. In this regard, there has been much work on developing techniques for estimating the capacity and available bandwidth of network paths based on end-point measurements. Bandwidth is a key factor in several network technologies, and several applications can benefit from knowing the bandwidth characteristics of their network paths.
The motivation behind bandwidth estimation has been the potential for applications and end-host-based protocols to take advantage of bandwidth information in making intelligent choices on server selection, TCP ramp-up, streaming media adaptation, etc. [3].

In this paper, the different bandwidth estimation techniques that have been proposed and used for wireless networks are analyzed. In particular, the performance of the state-of-the-art technique, 'Spruce', is analyzed in detail, and its advantages and shortcomings are discussed.


In addition, a novel intelligent estimation technique is introduced, together with the characteristics required to make it an efficient method.

The paper is organized as follows. Section II describes the related work, whereas Section III describes the state-of-the-art bandwidth estimation method and its characteristics. Section IV describes the experimental set-up that has been built, while Section V presents the simulation results. Section VI introduces a novel intelligent bandwidth estimation technique. Finally, Section VII provides the conclusions and future work in this direction.

II. Related Work

Recently, bandwidth estimation techniques have drawn widespread interest in the network management arena. A couple of bandwidth estimation techniques have been based on the packet-pair principle [4, 5]. However, the initial versions of such techniques did not consider the problem of cross-traffic interference. To alleviate this problem, various refinements have been proposed, including sending trains of packets of various sizes (e.g., bprobe [6]) and better filtering techniques to discard incorrect samples (e.g., nettimer [7]). However, the filtering is made complex by the multi-modality of the distribution of packet-pair spacing [8] and by the observation that the dominant mode might not correspond to the actual network bandwidth [9]. Several other bandwidth estimation techniques were proposed in the early years of research in wireless networks, such as cprobe [6] and the asymptotic dispersion rate [9]. Many of the recently proposed techniques fall into two categories: the packet rate method (PRM) and the packet gap method (PGM). PRM-based tools, such as pathload [10], PTR [11], pathchirp [12], and TOPP [13], are based on the observation that a train of probe packets sent at a rate lower than the available bandwidth arrives at the receiver at roughly the sending rate, whereas a train sent faster than the available bandwidth is spread out by the bottleneck queue.

The current research on bandwidth estimation algorithms can be classified into three categories [14], [15]: packet dispersion measurement (PDM), the probe gap model (PGM) and the probe rate model (PRM). PDM techniques, such as the packet pair or packet train, estimate network capacity by recording packet inter-arrival times. The main disadvantage of PDM-based techniques is that they have very low accuracy when applied to wireless networks. The basic principle of PGM is that the server sends a probe packet pair with time dispersion T_in and, after successful transmission, the receiver records a changed dispersion time T_out. The value T_out - T_in is then the time spent transmitting crossing traffic, under the assumption of a single bottleneck link. The crossing traffic rate BW_c can be written as BW_c = (T_out - T_in) x C / T_in, where C is the capacity of the network. Hence, the estimated available bandwidth is C - BW_c. The main disadvantage of PGM is that it assumes the network capacity to be known, which is what allows it to produce a fast estimate of a certain accuracy; in reality, the network capacity is not always known beforehand. PRM techniques estimate bandwidth using three kinds of traffic rates: the sender-side probing rate (C_s), the receiver-side probing rate (C_r) and the available bandwidth (BW).
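To make the gap-model arithmetic above concrete, the following is a minimal sketch of the PGM estimate (it is not the code of any particular tool); as PGM requires, the capacity C is assumed to be known.

    # A minimal sketch of the probe gap model (PGM) estimate described above;
    # the path capacity C must be known, which is PGM's main assumption.

    def pgm_available_bandwidth(t_in, t_out, capacity):
        """Available bandwidth (bit/s) from one probe pair (gaps in seconds)."""
        bw_cross = (t_out - t_in) * capacity / t_in   # crossing traffic rate
        return capacity - bw_cross

    # Example: 11 Mbit/s bottleneck, a 1.2 ms probe gap stretched to 1.8 ms.
    print(pgm_available_bandwidth(1.2e-3, 1.8e-3, 11e6) / 1e6, "Mbit/s")  # 5.5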
III. State-of-the-art Bandwidth Estimation

In terms of the kind of bandwidth being measured, most of the proposed techniques concentrate on one of two values: either the individual link bandwidths of a path, or the capacity of the path. In general, these techniques can be classified into two groups: single-packet and packet-pair techniques. The names refer to the number of packets used in a single probe. A measurement of a link or path consists of multiple probes; in some implementations [16] this can be of the order of 10 MB of data (14,400 individual probes) to measure a 10-hop path. The following paragraphs detail the theory of these techniques, suggested improvements and example implementations.

a. Single packet techniques: This method concentrates on estimating the individual link bandwidths, as opposed to end-to-end properties. These techniques are based on the observation that slower links take longer to transmit a packet than faster links. If it is known how long a packet takes to cross each link, the bandwidth of that link can be calculated.

b. Packet pair techniques: This method attempts to estimate the path capacity, not the per-link capacities discovered by single-packet techniques. These techniques have been in use since at least 1993, when Bolot [17] used them to estimate the path capacity between France and the USA; he was able to measure the transatlantic capacity, which at that time was 128 kbps, quite accurately. Packet-pair techniques are often referred to as packet dispersion techniques, a name that is perhaps more descriptive. A packet experiences a serialization delay across each link due to the bandwidth of the link. Packet-pair techniques send two identically sized packets back-to-back and measure the difference in time between the packets when they arrive at the destination.
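As a companion to the description in (b), here is a minimal sketch of the basic packet-pair capacity estimate. It assumes the pair stayed back-to-back through the narrow link and ignores the cross-traffic filtering that practical tools must add.

    # A minimal sketch of the packet-pair (dispersion) estimate: two
    # back-to-back packets of L bits leave the bottleneck separated by the
    # serialization time of one packet, so C ~= L / dispersion.

    def packet_pair_capacity(packet_bits, t_arrival_1, t_arrival_2):
        """Path capacity estimate (bit/s) from one pair's arrival times."""
        return packet_bits / (t_arrival_2 - t_arrival_1)

    # Example: 1500-byte probes arriving 1.09 ms apart -> about 11 Mbit/s.
    print(packet_pair_capacity(1500 * 8, 0.0, 1.09e-3) / 1e6, "Mbit/s")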


Spruce: Spruce has been one of the most successful bandwidth estimation techniques in the packet-pair family, and has been found to be significantly superior to other methods like Pathload and IGI [18]. The technique of Spruce is explained in detail below, followed by the experimental results in the next section.

Spruce (Spread Pair Unused Capacity Estimate) is a tool for end hosts to measure available bandwidth. It samples the arrival rate at the bottleneck by sending pairs of packets spaced so that the second probe packet arrives at the bottleneck queue before the first packet departs it. Spruce then calculates the number of bytes that arrived at the queue between the two probes from the inter-probe spacing at the receiver, and computes the available bandwidth as the difference between the path capacity and the arrival rate at the bottleneck. Spruce is based on PGM and, like other PGM tools, assumes a single bottleneck that is both the narrow and the tight link along the path.

Some of the characteristics that distinguish Spruce from other bandwidth estimation tools are explained below.
1. Spruce uses a Poisson process of packet pairs instead of packet trains (or chirps). This form of sampling makes it both non-intrusive and robust.
2. With careful parameter selection, Spruce ensures that the bottleneck queue is not emptied between the two probes in a pair, which is a prerequisite for the correctness of the gap model.
3. Spruce clearly distinguishes capacity measurement from available bandwidth measurement. Spruce considers that the capacity can be measured without difficulty with one of the capacity measurement tools, and assumes that the capacity remains stable while the available bandwidth is being measured. This assumption holds for all scenarios for which Spruce has been designed, i.e. estimating the bandwidth of paths in overlay networks.

In the next section, the performance of Spruce in computing the available bandwidth is analyzed in realistic network settings.

IV. Experimental Setup

Fig. 1 shows the simulation topology, in which multimedia applications send multimedia and crossing traffic to clients via a wired network as well as a last-hop WLAN. Traffic servers send crossing traffic that shares the bottleneck from the AP to the clients.

Fig. 1 Simulation Topology

In the experiment, it is assumed that the IEEE 802.11b WLAN is the bottleneck link on the end-to-end path: the WLAN has the smallest available bandwidth, which is therefore also the end-to-end available bandwidth.

Table 1 summarizes the configuration set up in NS. Two additional wireless update packages are used: NOAH(1) and the Marco Fiore package(2). The NOAH package (No Ad-Hoc) is used to simulate an infrastructure WLAN, and the Marco Fiore package provides a more realistic wireless network environment. As a result, in our experiment there are four bandwidth levels - 1, 2, 5.5 and 11 Mbps - depending on the distance from the AP. Fig. 2 shows this characteristic of a real IEEE 802.11b network.

Fig. 2 Signal Strength Around the Access Point

W_min and W_max are the minimum and maximum values of the contention window. The basic rate, i.e. the sending rate of control packets (ACK, RTS, CTS), is set to 1 Mbps.

In our experiment, six separate tests were conducted. Each test consists of one to three unicast video traffic flows, and one client starts moving from 5 s at a speed of 1 m/s.

(1) http://icapeople.epfl.ch/widmer/uwb/ns-2/noah/
(2) http://www.telematica.polito.it/fiore/ns2_wireless_update_patch.tgz


Table 1. Simulation Setup in NS-2.29

Transport protocol:  UDP
Wireless protocol:   802.11b
Routing protocol:    NOAH
Error model:         Marco Fiore package
Wired bandwidth:     100 Mbps LAN
MAC header:          52 bytes
W_min:               31
W_max:               1023
ACK:                 38 bytes
CTS:                 38 bytes
RTS:                 44 bytes
SIFS:                10 µsec
DIFS:                50 µsec
Basic rate:          1 Mbps

Variable network conditions were also introduced, realized by varying the current traffic load. This is done by generating CBR/UDP crossing traffic with 1500-byte packets. Additionally, the number of video traffic flows increases in each separate test; along with the increasing traffic load, the network becomes congested. This set-up is used to verify how Spruce performs under heavy network conditions.

V. Experimental Results

This section studies the performance of Spruce by comparing it with the Measured Bandwidth. Measured Bandwidth is based on the concept of the maximum throughput that an application can obtain; it depends on the transport mechanism, e.g. TCP or UDP.

Fig. 3 Comparison of bandwidth calculated from measurement and from Spruce with no crossing traffic (one server and one client)

Fig. 3 and Fig. 4 show the comparison of Measured Bandwidth (calculated from the NS-2 trace) and Estimated Bandwidth (Spruce) for the period 0 to 200 seconds without cross traffic. The Spruce traffic was started at 3 s. The Spruce probing traffic used a CBR/UDP flow sending 1500-byte packets at a rate of 0.15 Mbps.


Fig. 5 Comparison of bandwidth calculated frommeasured and spruce without crossing traffic.(Two clients and one cross traffic)Fig. 6 Comparison of bandwidth calculated frommeasured and spruce without crossing traffic.(Two clients and three cross traffics)Fig. 8 Comparison of bandwidth calculated frommeasured and spruce without crossing traffic.(Three clients and three cross traffic)As the demand for performance on the Internetgrows, so does the requirement for tools to accuratelymeasure performance. This growing demand alsomeans that solutions that place a large load on thenetwork would not be able to scale. This in fact createsan urgent need for having tools that can accuratelyestimate various types of bandwidths. Also, suchtechniques need to estimate the bandwidth accuratelywithout creating large volumes of traffic.An intelligent bandwidth estimation (iBE) techniqueis being researched by our team that would reduce theerror between the measured and the estimatedbandwidth. The basic idea of iBE is to use thedifference between the packet’s transmission time andreception time at MAC layer. The actual algorithm andthe mechanism of iBE are not explained herecompletely; as it is still under research. The initialresults are shown in Table II. It can be observed fromTable II that for CBR/UDP traffic of 0.5 and 1.0 Mbpsdata rate for different video clients, the iBE showssignificantly less error with respect to the actualmeasured bandwidth, as compared to Spruce.VII. Conclusion and Future WorkFig. 7 Comparison of bandwidth calculated frommeasured and spruce without crossing traffic.(Two clients and two cross traffics)VI. Necessity for Intelligent EstimationMethodIt can be observed from the experimental results inSection V that the performance of Spruce can beoffset by up to 50% as compared to the actualmeasured bandwidth. In practice, on an average, theperformance of Spruce is offset by 30%.This paper reviews the different categories ofbandwidth estimation techniques for wireless networks.Single pair and packet pair were the two prominentkinds of estimating bandwidth for such networks. Astate-of-the-art packet-pair estimation technique,“Spruce” was described and analyzed for differentkinds of Internet-based multimedia traffics. It wasfound over different conditions that “Spruce”though satisfactory most of the times, was found to giveerrors, as much as up to 50%.A new intelligent bandwidth estimation algorithm formultimedia delivery over wireless networks has beenresearched in the recent years. The initial results haveshown ‘intelligent technique’ to give results much210


Table II. Bandwidth estimation performance of iBE and Spruce, compared with the actual measurement (bandwidth values are medians in Mbps)

# | Video clients | Cross traffic                           | Measured | iBE  | Spruce | iBE error | Spruce error
1 | 1             | None                                    | 2.96     | 3.52 | 1.51   | 0.56      | 1.45
2 | 2             | None                                    | 3.12     | 3.41 | 1.49   | 0.29      | 1.63
3 | 2             | CBR/UDP 0.5 Mb/s                        | 2.72     | 2.67 | 1.62   | 0.05      | 1.1
4 | 2             | CBR/UDP 0.5 Mb/s + CBR/UDP 1.0 Mb/s     | 2.63     | 2.25 | 1.58   | 0.38      | 1.05
5 | 2             | CBR/UDP 0.5 Mb/s + 2 x CBR/UDP 1.0 Mb/s | 2.48     | 2.23 | 1.51   | 0.25      | 0.97
6 | 3             | 3 x CBR/UDP 1.0 Mb/s                    | 2.45     | 2.31 | 1.26   | 0.14      | 1.19

VII. Conclusion and Future Work

This paper reviewed the different categories of bandwidth estimation techniques for wireless networks; single-packet and packet-pair were the two prominent kinds of bandwidth estimation for such networks. A state-of-the-art packet-pair estimation technique, 'Spruce', was described and analyzed for different kinds of Internet-based multimedia traffic. Over different conditions, 'Spruce', though satisfactory most of the time, was found to give errors of as much as 50%.

A new intelligent bandwidth estimation algorithm for multimedia delivery over wireless networks has been under research in recent years. The initial results have shown the 'intelligent technique' to give results much closer to the actual measured bandwidth. Further work in this direction is to fully develop the intelligent bandwidth estimation method, and to test its performance against different multimedia-based applications.

Acknowledgments

The authors would like to acknowledge the support of the China Scholarship Council.

References
[1] H. Venkataraman, S. Sinanovic and H. Haas, "Performance Analysis of Hybrid Cellular Networks", Proc. IEEE PIMRC, Berlin, Germany, September 2005.
[2] J. Curtis and T. McGregor, "Review of Bandwidth Estimation Techniques", Department of Computer Science, University of Waikato, Hamilton, New Zealand, 2001.
[3] M. Jain and C. Dovrolis, "End-to-End Available Bandwidth: Measurement Methodology, Dynamics, and Relation with TCP Throughput", Proc. ACM SIGCOMM, Pittsburgh, PA, USA, August 2002.
[4] V. Jacobson and M. J. Karels, "Congestion Avoidance and Control", Proc. ACM SIGCOMM, Stanford, CA, USA, August 1988.
[5] S. Keshav, "A Control-Theoretic Approach to Flow Control", Proc. ACM SIGCOMM, Zurich, Switzerland, September 1991.
[6] R. L. Carter and M. E. Crovella, "Measuring Bottleneck Link Speed in Packet Switched Networks", Performance Evaluation, Elsevier, vol. 27-28, pp. 297-318, October 1996.
[7] K. Lai and M. Baker, "Measuring Link Bandwidths Using a Deterministic Model of Packet Delay", Proc. ACM SIGCOMM, Stockholm, Sweden, August 2000.
[8] V. Paxson, "Measurement and Analysis of End-to-End Internet Dynamics", PhD Thesis, University of California at Berkeley, 1997.
[9] C. Dovrolis, D. Moore, and P. Ramanathan, "What Do Packet Dispersion Techniques Measure?", Proc. IEEE INFOCOM, Anchorage, Alaska, USA, April 2001.
[10] G. U. Keller, T. E. Najjary and A. Sorniotti, "Operational Comparison of Available Bandwidth Estimation Tools", ACM SIGCOMM Computer Communication Review, vol. 38, no. 1, January 2008.
[11] N. Hu and P. Steenkiste, "Evaluation and Characterization of Available Bandwidth Probing Techniques", IEEE Journal on Selected Areas in Communications (JSAC), Special Issue on Internet and WWW Measurement, Mapping and Modeling, vol. 21, no. 6, pp. 879-894, August 2003.
[12] V. J. Ribeiro, R. H. Riedi, R. G. Baraniuk, J. Navratil, and L. Cottrell, "pathChirp: Efficient Available Bandwidth Estimation for Network Paths", Proc. Passive and Active Measurements Workshop, Ohio, USA, April 2003.
[13] B. Melander, M. Bjorkman, and P. Gunningberg, "A New End-to-End Probing and Analysis Method for Estimating Bandwidth Bottlenecks", Proc. IEEE Global Internet Symposium, San Francisco, USA, November 2000.
[14] T. Sun, G. Yang, L. J. Chen, M. Y. Sanadidi, and M. Gerla, "A Measurement Study of Path Capacity in 802.11b-based Wireless Networks", Proc. Wireless Traffic Measurement and Modeling (WiTMeMo), Seattle, USA, June 2005.
[15] L. Angrisani, A. Botta, A. Pescape, and M. Vadursi, "Measuring Wireless Links Capacity", Proc. 1st International Symposium on Wireless Pervasive Computing, Phuket, Thailand, January 2006.
[16] V. Jacobson, "pathchar - A Tool to Infer Characteristics of Internet Paths", presented at the Mathematical Sciences Research Institute, December 1997.
[17] J.-C. Bolot, "End-to-End Packet Delay and Loss Behaviour in the Internet", Proc. ACM SIGCOMM, San Francisco, USA, September 1993.
[18] J. Strauss, D. Katabi and F. Kaashoek, "A Measurement Study of Available Bandwidth Estimation Tools", Proc. 3rd ACM SIGCOMM Internet Measurement Conference, Miami Beach, FL, USA, pp. 39-44, 2003.


Section 5B: COMPUTER NETWORKS 2


Quality of Service in IMS-based Content Networking

Dalton Li, Jonathan Lv
dakeli@alcatel-lucent.com; lv@alcatel-lucent.com
Alcatel-Lucent Qingdao R&D Center, Qingdao 266101, China

Abstract

This paper introduces the IP Multimedia Subsystem (IMS) as the infrastructure of content networking. The proposed IMS-based content networking architecture is enhanced by interaction between the bearer gateway and the signaling layer to guarantee the Quality of Service (QoS) of the traffic. A new IMS component, the Content Navigating Function (CNF), is also introduced to provide content service to the end user. With its well-designed architecture and maturing specifications, the QoS mechanism of IMS does contribute to traffic management in the content network. A typical scenario description and the high-level call flows of content sharing sessions are also provided.

Keywords: QoS; IMS; Content network.

1 Introduction

People in large and rapidly growing numbers are creating and sharing their own content, or commercial content products, with each other through networks. The traffic generated by this sharing represents a significant component of today's communication network traffic. Needless to say, content networking providers are increasingly concerned with how to guarantee Quality of Service (QoS) for this kind of traffic, especially bandwidth-hungry traffic like multimedia streams. Content caching and server clustering have been used for web content delivery [1], and the Content Delivery Network (CDN) [2, 3, 4] has been proposed to improve the end-user Quality of Experience (QoE) by replicating content in a network of geographically distributed surrogate servers. Request-routing mechanisms [5], content replication techniques [6], load balancing and cache management [7] have also been studied. To implement CDNs efficiently, the IETF CDI group proposed the Content Distribution Internetworking (CDI) concept, which enhances CDNs by aggregating separate CDNs into an integrated one. As a result, several Requests For Comments (RFCs) have been published, such as the CDI architecture overview [8], its models and usage scenarios [9], and its terminology and requirements [10].

CDN work currently focuses on how to distribute content across the network, how to redirect requests, and how to perform accounting. It does not, however, consider the Service Level Agreement (SLA) that describes the subscription between the end user and the service provider. On the other hand, the IP Multimedia Subsystem (IMS) [11, 12], as a standardized Next Generation Networking (NGN) architecture and the industry's great unifying technology, provides a good solution for this.

This paper proposes an IMS-based architecture for both centralized and peer-to-peer content networks, together with a new IMS Application Server (AS) [11], the Content Navigating Function (CNF), dedicated to the content network to guarantee the QoS of the traffic. With its well-designed architecture and maturing specifications, the IMS QoS mechanism [13, 14, 15, 16, 17] does contribute to traffic management in the content network. A typical scenario description and the high-level call flows of content sharing sessions are also provided.

2 IMS-based content network architecture

Figure 1 illustrates the IMS-based content network architecture, which mainly takes specifications from the 3rd Generation Partnership Project (3GPP) [12, 13, 14] and from Telecommunication and Internet converged Services and Protocols for Advanced Networks (TISPAN) [15] as its reference.

Figure 1. IMS based content networking architecture


2.1 CNF

The Content Navigating Function (CNF) provides the information necessary for the user to select a content service. The functionalities of the CNF are:
1. Collecting the available content service information that the client can select.
2. Providing UI functionality for the end user to query the content service information.
3. Presenting the content service information according to the user's preference.
4. Navigating to the address of the content host or peer host in the form of Uniform Resource Identifiers (URIs) or IP addresses.
5. Providing other useful data about the content service.

The CNF is a kind of SIP Application Server (AS) [11].

2.2 CSCF

The Call Session Control Functions (CSCFs) [11, 12] are the core elements performing session control in the IMS architecture. A CSCF can act as a Proxy CSCF (P-CSCF), Serving CSCF (S-CSCF) or Interrogating CSCF (I-CSCF). The P-CSCF is the first contact point for the UE within the IMS; the S-CSCF actually handles the session states in the network; the I-CSCF is mainly the contact point within an operator's network for all IMS connections destined to a subscriber of that network operator, or to a roaming subscriber currently located within that network operator's service area [12].

In the IMS-based content network, the CSCFs also provide session control functionalities, such as session establishment, modification and termination for content sharing sessions.

2.3 PD-FE

The Policy Decision Functional Entity (PD-FE) [16] is a logical policy decision element for service-based policy control. It makes policy decisions using policy rules defined by the network operator according to the user's subscription or request, and then maps the local policy into the parameters sent to the Border Gateway Function (BGF). The PD-FE hides the underlying network topology from applications, which allows it to offer a common view to the CSCF regardless of the underlying network topology and the particular access technology in use.

In the IMS-based content network, the PD-FE makes policy decisions for content sharing sessions according to the user's subscription or request, and asks the BGF to enforce them.

2.4 BGF

The Border Gateway Function (BGF) is a packet-to-packet gateway for user-plane media traffic. The BGF performs both policy enforcement functions and NAT functions under the control of the PD-FE in each of the network segments: access, aggregation and core [15]. The BGF's policy enforcement function interacts with the PD-FE through the Ia reference point and is under the control of the PD-FE. The BGF operates on micro-flows, i.e. on individual flows of packets belonging to a particular application session. Its policy enforcement function is a dynamic gate that can block individual flows or allow authorized flows to pass: the PD-FE instructs the BGF to open or close its gate for a particular flow, i.e. to allow an admitted flow to pass through the BGF. Resources managed by the BGF include the handling of a pool of IP addresses/ports and the bit rate on the BGF interfaces.

In the IMS-based content network, the BGF enforces the policy from the PD-FE, i.e. it allocates the bandwidth needed for allowed media flows of content sharing sessions and blocks unauthorized flows. Furthermore, the BGF reports the bandwidth usage status to the PD-FE. Essentially, QoS is guaranteed by this.

2.5 HSS

The Home Subscriber Server (HSS) [12] is the main data storage for all subscriber- and service-related data of the IMS. The main data stored in the HSS include user identities, registration information, access parameters and service-triggering information.
In addition to functions related to IMS functionality, the HSS contains the subset of Home Location Register and Authentication Center (HLR/AUC) functionality required by the packet service domain and the circuit service domain.

In the IMS-based content network, the HSS stores the profile of content network users, such as QoS parameters, service-triggering information, etc.

2.6 Content/Peer Host

A Content/Peer Host [1] stores media content or other service information and is responsible for delivering content data to the client. The difference between content hosts and peer hosts is that content hosts are a number of stable, reliable and powerful servers with large storage, provided and maintained by the service provider at relatively centralized sites, while peer hosts are mainly highly distributed, low-performance personal computers.

3 CNF communication mechanism

3.1 CNF and Content/Peer host

When a Content/Peer host comes online, it contacts the CNF and reports its status and the information about the services it can provide. The CNF saves this information and responds with an OK as the acknowledgement. SIP [18] can be used as the communication protocol. Peer hosts communicate with the CNF more frequently than content hosts, since peers join and leave the network randomly.
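As background for the call flows in Section 4, the per-flow gate role that Section 2.4 assigns to the BGF can be sketched as follows. This is only an illustrative token-bucket model with hypothetical names; the actual gate behaviour on the Ia reference point is defined by the RACS and 3GPP specifications cited above.

    # Illustrative token-bucket model of the BGF's per-flow gate; not the
    # standardized Ia behaviour. The PD-FE authorizes the gate and sets the
    # rate; packets beyond the authorized rate are rejected.

    import time

    class FlowGate:
        def __init__(self, rate_bps, burst_bits):
            self.rate = rate_bps              # bandwidth authorized by PD-FE
            self.burst = burst_bits
            self.tokens = burst_bits          # spending budget in bits
            self.last = time.monotonic()
            self.open = False                 # closed until PD-FE authorizes

        def authorize(self):                  # PD-FE instructs: open the gate
            self.open = True

        def admit(self, packet_bits):
            """Return True if the packet may pass the gate."""
            if not self.open:
                return False
            now = time.monotonic()
            self.tokens = min(self.burst,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= packet_bits:
                self.tokens -= packet_bits
                return True
            return False                      # over budget: block / report

    gate = FlowGate(rate_bps=2e6, burst_bits=32_000)
    gate.authorize()
    print(gate.admit(1500 * 8))               # True: within the budget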


3.2 CNF and CSCF

The CNF is an Application Server (AS) to the CSCF. When requested by the user, the CSCF queries the CNF for available content service information or available peer host information, and the CNF sends the requested information back to the CSCF. They can use the SIP protocol [18] to communicate, as defined in [11].

4 Typical scenarios and call flows

4.1 Peer-to-Peer content network

Figure 2 depicts the call flow of the P2P IMS-based content network. In a P2P content network, since peers can start sessions directly without CSCF awareness, the CSCF cannot control every session at the signaling level. It is a transport-layer entity, the BGF, that takes responsibility for managing the traffic. When the user registers to the network, the BGF allocates an amount of bandwidth for the user according to the user's subscription. The BGF performs Deep Packet Investigation (DPI) to guarantee the bandwidth allocated for the user and to make sure the bandwidth used by each user cannot exceed the assigned value. This resource rationing strategy eliminates the unfairness whereby some hungry users consume lots of bandwidth, so that other users can only get service with poor quality or even get starved.

1. When a peer host starts up, it reports its status and the information about the services it can provide to the CNF.
2. The CNF responds OK and saves this information in its local memory. Here the CNF has functionality similar to a Gnutella host-cache server [1].
3. The user equipment registers with the CSCF.
4. The CSCF queries the Home Subscriber Server (HSS) to get the user's profile, which includes QoS parameters such as the bandwidth subscription.
5. The HSS sends the profile to the CSCF as requested.
6. The CSCF asks the PD-FE to authorize bandwidth for the user as subscribed.
7. The PD-FE instructs the BGF to enforce the QoS policy for the user.
8. The BGF reports the result of the policy enforcement to the PD-FE.
9. The PD-FE reports the result of the policy enforcement to the CSCF.
10. The CSCF sends the registration response to the user.
11. The user asks the CSCF for prospective peer addresses and related information.
12. The CSCF queries the CNF for this.
13. The CNF sends the information to the CSCF as requested.
14. The CSCF sends the requested information back to the user.
15. The user contacts peer hosts to get the services wanted.
16. The peer hosts respond and provide services to the user, such as content sharing.

During the content sharing session, the BGF performs Deep Packet Investigation (DPI) to guarantee the bandwidth allocated for the user and to keep the bandwidth used by the user within the assigned value.

17. If the BGF finds that the user is trying to use more bandwidth than the assigned value, it reports this to the PD-FE.
18. The PD-FE reports to the CSCF that the user is trying to use more bandwidth.
19. The CSCF prompts the user to buy more bandwidth.

4.2 Centralized content network

Figure 3 depicts the call flow of the IMS-based centralized content network. In the centralized case, every session establishment must pass through the CSCF, so the CSCF can control all sessions. When the user initiates a session, the CSCF instructs the PD-FE/BGF to allocate the requested bandwidth for the session. This QoS policy overrides the policy enforced at the user registration stage; thus per-session dynamic QoS policy enforcement can be realized. This differs from the P2P content network.

Steps 1~10 are similar to the P2P content network. The description from step 11 onwards follows:

11. The user asks the CSCF for currently available content service information.
12. The CSCF queries the CNF for this.
13. The CNF sends the information back to the CSCF as requested.
14. The CSCF sends the available content service information back to the user.
15. The user picks a service and sends a session initiation request to the CSCF, usually via a SIP [18] INVITE message.
16. The CSCF sends the session initiation request to a content host that can provide the service requested by the user.
17. The content host responds to the CSCF.
18. The CSCF asks the PD-FE to authorize bandwidth for the session.
19. The PD-FE instructs the BGF to enforce the new QoS policy for this session.
20. The BGF reports the result of the policy enforcement to the PD-FE.
21. The PD-FE reports the result of the policy enforcement to the CSCF.
22. The CSCF sends the session initiation response to the user.

The content host can then start the content delivery to the user. By enforcing the QoS policy in step 19, the BGF guarantees the bandwidth allocated for the session and ensures it cannot be exceeded.
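The per-session authorization in step 19, which replaces the subscription-level policy installed at registration (step 7), can be pictured with a minimal sketch; the model and all names are illustrative only.

    # Illustrative sketch of the per-session policy override (step 19)
    # replacing the registration-stage subscription policy (step 7).

    class PolicyStore:
        def __init__(self):
            self.authorized_bps = {}                      # user -> current policy

        def enforce_registration(self, user, subscribed_bps):
            self.authorized_bps[user] = subscribed_bps    # step 7

        def enforce_session(self, user, session_bps):
            self.authorized_bps[user] = session_bps       # step 19 override

    store = PolicyStore()
    store.enforce_registration('alice', 2_000_000)
    store.enforce_session('alice', 6_000_000)     # video session needs more
    print(store.authorized_bps['alice'])          # 6000000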


Figure 2. Call flows of IMS based P2P content network (high-level message sequence chart of steps 1-19 above; during the service phase the BGF performs DPI, guaranteeing the bandwidth allocated in step 7 and limiting usage to the assigned value)


Figure 3. Call flows of IMS based centralized content network (high-level message sequence chart of steps 1-22 above; after step 22 content delivery starts, with the BGF performing DPI and limiting bandwidth usage to the value assigned in step 19)


5 Conclusion

This paper proposed an IMS-based content network architecture and a new IMS Application Server (AS), the Content Navigating Function (CNF), dedicated to content networking to guarantee the QoS of the traffic. The mature IMS QoS mechanism can provide solid QoS assurance for content networks. A typical scenario description and the high-level call flows of content sharing sessions were also provided.

References
[1] Markus Hofmann and Leland R. Beaumont, "Content Networking: Architecture, Protocols, and Practice", Morgan Kaufmann Publishers, 2005.
[2] A. Vakali and G. Pallis, "Content delivery networks: Status and trends", IEEE Internet Computing, vol. 7, no. 6, pp. 68-74, Nov./Dec. 2003.
[3] G. Pallis and A. Vakali, "Insight and Perspectives for Content Delivery Networks", Communications of the ACM, vol. 49, no. 1, pp. 101-106, ACM Press, NY, USA, January 2006.
[4] Al-Mukaddim Khan Pathan and Rajkumar Buyya, "A Taxonomy and Survey of Content Delivery Networks", Technical Report GRIDS-TR-2007-4, Grid Computing and Distributed Systems Laboratory, The University of Melbourne, Australia, 12 February 2007.
[5] A. Barbir, B. Cain, R. Nair, O. Spatscheck, "Known Content Network (CN) Request-Routing Mechanisms", RFC 3568, July 2003. http://www.ietf.org/rfc/rfc3568.txt?number=3568
[6] J. Kangasharju, J. Roberts, and K. W. Ross, "Object Replication Strategies in Content Distribution Networks", Computer Communications, vol. 25, no. 4, pp. 367-383, March 2002.
[7] M. Cieslak, D. Foster, G. Tiwana, and R. Wilson, "Web Cache Coordination Protocol Version 2". http://www.web-cache.com/Writings/Internet-Drafts/draft-wilson-wrec-wccp-v2-00.txt
[8] M. Green, B. Cain, G. Tomlinson, S. Thomas, and P. Rzewski, "Content Internetworking Architectural Overview", Internet Draft, IETF CDI Working Group, 2002.
[9] P. Rzewski, M. Day, and D. Gilletti, "Content Internetworking (CDI) Scenarios", RFC 3570, IETF CDI Working Group, July 2003. http://www.ietf.org/rfc/rfc3570.txt?number=3570
[10] M. Day, B. Cain, G. Tomlinson, P. Rzewski, "A Model for Content Internetworking (CDI)", RFC 3466, February 2003. http://www.ietf.org/rfc/rfc3466.txt?number=3466
[11] 3rd Generation Partnership Project, "IP Multimedia Subsystem (IMS); Stage 2", 3GPP TS 23.228. http://www.3gpp.org/ftp/Specs/html-info/23228.htm
[12] 3rd Generation Partnership Project, "Network Architecture", 3GPP TS 23.002. http://www.3gpp.org/ftp/Specs/html-info/23002.htm
[13] 3rd Generation Partnership Project, "End-to-end Quality of Service (QoS) concept and architecture", 3GPP TS 23.207. http://www.3gpp.org/ftp/Specs/html-info/23207.htm
[14] 3rd Generation Partnership Project, "Quality of Service (QoS) concept and architecture", 3GPP TS 23.107. http://www.3gpp.org/ftp/Specs/html-info/23107.htm
[15] European Telecommunications Standards Institute, "Resource and Admission Control Sub-System (RACS) - Functional Architecture", ETSI ES 282 003. http://www.etsi.org
[16] International Telecommunication Union, Telecommunication Standardization Sector, "Functional Architecture and Requirements for Resource and Admission Control Functions in Next Generation Networks", ITU Y.RACF. http://www.itu.org
[17] 3rd Generation Partnership Project, "End-to-End Quality of Service (QoS) Signalling Flows", 3GPP TS 29.208. http://www.3gpp.org/ftp/Specs/html-info/29208.htm
[18] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and
E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002. http://www.ietf.org/rfc/rfc3261.txt?number=3261

Abbreviations, Acronyms, and Terms

3GPP - 3rd Generation Partnership Project
AS - Application Server
BGF - Border Gateway Function
CDI - Content Distribution Internetworking
CDN - Content Delivery Network
CNF - Content Navigating Function
CSCF - Call Session Control Function
DPI - Deep Packet Investigation
ETSI - European Telecommunications Standards Institute
HLR/AUC - Home Location Register and Authentication Center
HSS - Home Subscriber Server
IMS - IP Multimedia Subsystem
IP - Internet Protocol
ITU - International Telecommunication Union
P2P - Peer-to-Peer
PD-FE - Policy Decision Functional Entity
PDF - Policy Decision Function
QoE - Quality of Experience
QoS - Quality of Service
RFC - Request For Comments
SDP - Session Description Protocol
SIP - Session Initiation Protocol
SLA - Service Level Agreement
TISPAN - Telecommunication and Internet converged Services and Protocols for Advanced Networks
UE - User Equipment
URI - Uniform Resource Identifier


The Enhanced Dv-Hop Algorithm in Ad Hoc Networks

Zhang Pin, Xu Zhifu
School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, China
zhangpin@hdu.edu.cn, qzxuzhifu@163.com
(The study is sponsored by the Nature Science Foundation of Zhejiang Province (R105473).)

Abstract: Dv-Hop is one of the range-free localization algorithms for Wireless Sensor Networks. When the anchor node density is low, Dv-Hop shows a large average error and poor stability. This paper proposes an Enhanced Dv-Hop Algorithm (EDVA) to improve the performance at low anchor node density. The main idea of EDVA is to place anchor nodes at the boundary in order to reduce the average error. Simulations show that EDVA achieves better average error and stability than Dv-Hop.

Keywords: WSN; Dv-Hop; Average error; Improved; Density

I. INTRODUCTION

Ad-hoc wireless sensor networks have many attractive applications and, since they are deployed in different environments, determining node locations is fundamental to these applications. Although location information can be obtained with good precision from the GPS system, this is obviously not a scalable solution, because implementing GPS on a large number of nodes is very expensive [1].

In an ad-hoc localization system, nodes determine their position in a common coordinate system using a number of anchor nodes that already know their own location (through some external means, such as GPS, in that coordinate system). These systems assume all nodes possess a ranging capability (the ability to estimate distances to other nodes); the nodes use their range estimates and several distributed position-fixing techniques to determine their positions in the coordinate system [2].

Radiolocation approaches applied to ad-hoc sensor networks face several challenges [3]: sparse reference points, limited ranging accuracy, and the need for low power consumption. The number of anchor nodes, i.e. nodes with prior knowledge of their location relative to a global coordinate system, is assumed to be limited. The other nodes, whose communication range is limited to their immediate neighborhood, have to reach the anchor nodes hop by hop, estimate their distances to them, and then calculate their coordinates by performing traditional triangulation algorithms. The placement of the anchor nodes is therefore important for reducing the distance estimation error.

II. DV-HOP ALGORITHM

The Dv-Hop algorithm [4-5] is one of the APS distributed localization algorithms. It relies mainly on distance-vector routing between the unknown nodes and the anchor nodes, and it is easy to implement. With moderate accuracy, it is a good choice for Wireless Sensor Networks with limited hardware support.

The Dv-Hop algorithm can be divided into three phases.

First, the smallest hop count between each unknown node and each anchor node is computed, using a typical distance-vector exchange protocol: anchor nodes broadcast packets containing hop-count information to their neighbor nodes; each receiving node records the smallest hop count to every anchor node, ignoring larger hop counts from the same anchor node, then increases the count by 1 and retransmits it to its neighbors. In this way each node obtains the minimal hop count to each anchor node.
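The first phase just described can be illustrated with a small sketch (this is not the paper's code): a per-anchor breadth-first search over the connectivity graph yields the same minimal hop counts as the distance-vector flood.

    # A minimal sketch of Dv-Hop's first phase: every node keeps the smallest
    # hop count to each anchor; a BFS per anchor over the connectivity graph
    # gives the same result as the distance-vector flood described above.

    from collections import deque

    def min_hops(neighbors, anchors):
        """neighbors: dict node -> iterable of neighbors.
        Returns hops[node][anchor] = smallest hop count."""
        hops = {n: {} for n in neighbors}
        for a in anchors:
            hops[a][a] = 0
            queue = deque([a])
            while queue:
                u = queue.popleft()
                for v in neighbors[u]:
                    if a not in hops[v]:          # first visit = fewest hops
                        hops[v][a] = hops[u][a] + 1
                        queue.append(v)
        return hops

    chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(min_hops(chain, anchors=[1, 4])[2])     # {1: 1, 4: 2}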


Then the true distance between an unknown node and each anchor node is estimated from the hop counts and path lengths. Based on the anchor node location information recorded in the first phase, formula (1) is used to estimate the average distance of each hop:

d = ( Σ_{i≠j} √((X_i − X_j)² + (Y_i − Y_j)²) ) / ( Σ_{i≠j} h_ij )          (1)

where d is the average distance of each hop, (X_i, Y_i) is the true coordinate of anchor node i, and h_ij is the number of hops on the minimal-hop path between anchor nodes i and j. We then get

D_i = d * h_i          (2)

where D_i is the estimated distance from the unknown node to anchor node i, d is the average distance per hop, and h_i is the hop count from the unknown node to anchor node i.

Finally, once the unknown node has obtained distances to three or more anchor nodes, maximum likelihood estimation is used to fix the node location, as in formula (3):

(X − X_1)² + (Y − Y_1)² = D_1²
(X − X_2)² + (Y − Y_2)² = D_2²
...
(X − X_n)² + (Y − Y_n)² = D_n²          (3)

where (X, Y) is the coordinate of the unknown node, (X_i, Y_i) is the true coordinate of anchor node i, and D_i is the estimated distance from the unknown node to anchor node i. Subtracting the last equation from the others gives the matrix form

A X = b          (4)

where

A = 2 * [ X_1 − X_n       Y_1 − Y_n
          ...
          X_{n−1} − X_n   Y_{n−1} − Y_n ]          (5)

b = [ X_1² − X_n² + Y_1² − Y_n² + D_n² − D_1²
      ...
      X_{n−1}² − X_n² + Y_{n−1}² − Y_n² + D_n² − D_{n−1}² ]          (6)

By using MMSE we obtain the unknown node's coordinates:

X = (AᵀA)⁻¹ Aᵀ b          (7)

Anchor nodes are randomly generated in the first phase of DV-Hop. Through simulation, we find that when the anchor node density reaches 10% (100 nodes, including 10 anchor nodes, arranged randomly in a 100x100 network region), the positioning error is around 30%. Although DV-Hop is easy to implement, it has the disadvantage that the positioning accuracy decreases quickly when the anchor node density or the network connectivity drops.

III. An Enhanced Dv-Hop algorithm

Dv-Hop generates anchor nodes randomly in its first phase. If the distribution of anchor nodes is not even, some nodes will suffer from poor accuracy; that is, Dv-Hop is not stable, just as the simulation confirms. This paper proposes the Enhanced Dv-Hop algorithm, which places anchor nodes at the boundary. EDVA has two advantages: (1) it reduces the hop-count variance of the whole network, which reduces the error of the estimated distances and improves the stability of the accuracy; (2) it reduces the number of ranging operations and thereby the power consumption. EDVA is divided into three phases: (1) select four border anchor nodes and compute the smallest hop count between each unknown node and each anchor node; (2) compute the average per-hop distance for each anchor node; (3) once an unknown node has its distance to each anchor node, use the maximum likelihood principle to calculate its coordinates.

We use Matlab to simulate EDVA. In a 100x100 square, 100 nodes are placed randomly, including five anchor nodes; the ranging field is 50. Figure 1 shows the node distribution: the stars are anchor nodes and the others are ordinary nodes.

When simulating EDVA, we set five anchor nodes in the 100x100 square region and randomly placed 92 unknown nodes. The anchor nodes are located at the center and the four corners. Figure 2 shows the node distribution.
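As an illustration of equations (4)-(7), the following is a minimal sketch of the linearized least-squares position fix; numpy's lstsq computes the same solution as X = (AᵀA)⁻¹Aᵀb.

    # A minimal sketch of the position fix in equations (4)-(7): subtract the
    # last anchor's circle equation to linearize, then solve least squares.

    import numpy as np

    def mmse_position(anchors, dists):
        """anchors: n x 2 anchor coordinates; dists: n estimated distances D_i.
        Returns the estimated (x, y) of the unknown node."""
        anchors = np.asarray(anchors, float)
        d = np.asarray(dists, float)
        xn, yn, dn = anchors[-1, 0], anchors[-1, 1], d[-1]
        A = 2.0 * (anchors[:-1] - anchors[-1])                    # eq. (5)
        b = (anchors[:-1, 0]**2 - xn**2 + anchors[:-1, 1]**2 - yn**2
             + dn**2 - d[:-1]**2)                                 # eq. (6)
        pos, *_ = np.linalg.lstsq(A, b, rcond=None)               # eq. (7)
        return pos

    anchors = [(0, 0), (100, 0), (0, 100), (100, 100), (50, 50)]
    true_pos = np.array([30.0, 60.0])
    d = [np.hypot(*(true_pos - a)) for a in anchors]
    print(mmse_position(anchors, d))              # approximately [30. 60.]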


Figure 1: Node distribution for the Dv-Hop algorithm

Figure 2: Node distribution for the improved Dv-Hop algorithm

IV. Simulation Analysis

We compare the average error of the traditional Dv-Hop and the improved Dv-Hop algorithm. The average error is defined as

average error = ( Σ_{i=1}^{n} √((X̂_i − X_i)² + (Ŷ_i − Y_i)²) ) / n

where n is the number of unknown nodes, (X̂_i, Ŷ_i) are the estimated coordinates of unknown node i, and (X_i, Y_i) are its true coordinates.

We carried out 20 tests for each algorithm. Figure 3 shows the results of the Dv-Hop algorithm and EDVA; the line with stars is DV-Hop and the line with diamonds is EDVA.

Figure 3: Comparison between the Dv-Hop algorithm and EDVA

Figure 3 shows that the error range of DV-Hop is about 15%-30%, while EDVA shows a better error range of 12%-15%. EDVA can achieve better accuracy even with a lower anchor density, which also means less power consumption. Moreover, setting the anchors at the boundary reduces the variance of the hop counts from ordinary nodes to the anchor nodes, and so finally achieves better accuracy and stability.

V. Conclusion

The Dv-Hop algorithm has good scalability and moderate positioning accuracy, but in networks with low anchor node density it shows poor stability and position accuracy. This paper proposed the Enhanced Dv-Hop algorithm, which achieves better positioning accuracy and better stability even with a lower anchor node density.

REFERENCES
[1] N. Bulusu, J. Heidemann, D. Estrin, "GPS-less low cost outdoor localization for very small devices", IEEE Personal Communications Magazine, 7(5)(2000): 28-34.
[2] He T, Huang C, "Range-free localization schemes for large scale sensor networks", Proceedings of the Ninth Annual International Conference on Mobile Computing and Networking, San Diego, California, September 2003: 81-95.
[3] N. Patwari, J. N. Ash, S. Kyperountas et al., "Locating the Nodes: Cooperative localization in wireless sensor networks", IEEE Signal Processing Magazine, vol. 22, 2005.
[4] D. Niculescu and B. Nath, "Ad hoc positioning system (APS)", Proc. of GLOBECOM '01, IEEE, San Antonio, TX, USA, 2001: 2926-2931.
[5] Niculescu D, Nath B, "DV-based positioning in ad hoc networks", Telecomm. Systems, 2003, 22(1): 267-280.


Program Dependence Graph Generation and Its Use in Network Applications Analysis

Jing Huang, Xiaojun Wang
School of Electronic Engineering
Dublin City University
Ireland
jing@eeng.dcu.ie, xiaojun.wang@dcu.ie

Abstract—The Program Dependence Graph (PDG) is an Intermediate Representation used by compilers to characterize control-flow and data-flow dependences. It is widely used in optimizations for parallel processor systems. This paper demonstrates an implementation of PDG generation as a compiler pass and the use of the PDG in profiling network applications. The paper also introduces our ongoing and future work on using the PDG in research on the application partitioning and mapping problem.

Keywords-Program Dependence Graph; Dependence Analysis; Network Application

I. INTRODUCTION

In the internal work flow of compilers, an Intermediate Representation (IR) is a data structure used to collect the input information, e.g. the semantics of C code. Most compiler optimizations operate on a specific kind of IR. Classical examples of IRs include the Control Flow Graph (CFG) built for flow analysis, the Abstract Syntax Tree (AST) employed in syntax-directed translation, etc.

Of particular relevance to our interest, the task allocation problem for network processor systems [1], the compiler needs to characterize the dependence profile of an application. Previous research has employed the Annotated Directed Acyclic Graph (ADAG) [2], basic-block based task graphs [3], general analytical models [4], etc. to represent the applications. However, these representations are generally derived from runtime traces of the network applications; from the compiler's perspective, they are not directly applicable. Rather, efficient representations of static profiling results are required during compilation. In [5], the Program Dependence Graph (PDG) was used as the IR to statically characterize the applications and was fed into the task partitioning algorithm. The PDG explicitly expresses the dependences of a given program in a graph, and implicitly indicates the opportunities for code parallelization. Given these features, it can be used extensively in compiler optimizations for parallel systems like network processor systems. Hence, we implemented a compiler pass that efficiently generates the PDG in the Machine SUIF [6] compiler infrastructure. This paper summarizes the work of the PDG pass implementation and demonstrates its use in network applications analysis.

II. DESIGN AND IMPLEMENTATION

The PDG consists of two sub-graphs, the Control Dependence Graph (CDG) and the Data Dependence Graph (DDG). The CDG expresses the control dependences, while the DDG depicts the data dependences. We give the terminology of the graphs and the implementation issues in turn.

A. Terminology of the PDG

The PDG is a graph IR strongly related to the classical graph IR, the Control Flow Graph (CFG). As in the CFG, the instructions are grouped together at the Basic-Block (BB) level. In a BB, the first instruction is the only entry point in the control flow, while the last instruction is the only exit point. The CFG thus represents the control flow, with its nodes being BBs and its edges being the paths of the control flow.

The most basic kind of CDG is also composed of BBs; however, its edges now represent the Control Dependences (CD). A CD is an abstraction of the execution order. For example, suppose node x in the CFG (i.e. basic block x) ends with a branch instruction and hence has two paths at the exit point of the node.
If node y (i.e. basic block y) will be executed only when the control flow takes the true path at the exit of node x, we say that node y is control dependent on node x on the true edge. Correspondingly, in the CDG a directed edge is added from node x to y, labeled with a control condition, e.g. true in this example. After this initial generation of the CDG, so-called region nodes are inserted in a second phase to represent sets of control dependences. For instance, if node y is control dependent on node a on the true edge and on node b on the false edge, we create a region node R1 to hold the control dependence set {(a, true), (b, false)}; node y is then made control dependent on the newly created region node R1 only.

As for the DDG, its nodes are still BBs, while its edges now represent the data dependences. If an instruction in basic block y uses a variable that is defined in basic block x (i.e. a def-use chain exists between the two basic blocks), we define a data dependence edge between x and y. The weight of the edge is the number of such def-use chains across the two basic blocks, and in the DDG the edges are labeled with this weight.

The PDG can then be constructed easily by combining the CDG and the DDG.
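The structure just described can be summarized in a small sketch. It mirrors the paper's Pdg_node/CDset/Ddg_edge design only loosely (the actual pass is written over Machine SUIF); the names below are illustrative.

    # Loose illustrative sketch of the PDG structure of this section: nodes
    # (basic blocks plus region nodes), control dependence edges labelled
    # with a branch condition, data dependence edges weighted by the number
    # of def-use chains. Not the paper's actual classes.

    from collections import defaultdict

    class PDG:
        def __init__(self):
            self.kind = {}                      # node -> entry/statement/predicate/region
            self.cd = defaultdict(list)         # x -> [(y, condition)]
            self.dd = defaultdict(lambda: defaultdict(int))   # x -> {y: weight}

        def add_node(self, n, kind):
            self.kind[n] = kind

        def add_control_dep(self, x, y, condition):
            """y is control dependent on x along the given branch edge."""
            self.cd[x].append((y, condition))

        def add_data_dep(self, x, y):
            """One def-use chain from block x to block y; weights accumulate."""
            self.dd[x][y] += 1

    pdg = PDG()
    pdg.add_node('B1', 'predicate')
    pdg.add_node('B2', 'statement')
    pdg.add_control_dep('B1', 'B2', 'true')
    pdg.add_data_dep('B1', 'B2')
    print(pdg.cd['B1'], dict(pdg.dd['B1']))     # [('B2', 'true')] {'B2': 1}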


Figure 1. Class Design of PDG Pass

Since the nodes of both graphs are largely the same (i.e. BBs), with a few additional region nodes in the CDG, the merge process is straightforward.

B. Implementation of the PDG Pass

The algorithm for PDG generation is adopted from [7] and was implemented in the Machine SUIF compiler infrastructure. Consistent with the other built-in SUIF passes, the PDG pass is designed in an object-oriented pattern. The design model of the classes is given in Fig. 1. By construction, the PDG nodes can be classified further into entry nodes, statement nodes, predicate nodes and region nodes. We modelled each of them as a sub-class inheriting from the parent class Pdg_node; the dashed lines labelled "refines" in the figure represent this relationship. The CDset class models an arbitrary set of control dependences and is used by the Pdg_node class to represent the control dependences of a given node. Finally, the Ddg_edge class gives the data dependence information between PDG nodes. It has a weight property to indicate the number of def-use chains, as explained earlier. We do not need a class to model the CDG edges, since the control dependences are implicitly included in the Pdg_node class, specifically in its _parents and _children properties.

The class methods of the PDG, namely generate_CDG and generate_DDG, output the graph results both in pure text and in a graph description format (.dot files). The description files can be fed into Graphviz [12] to generate actual image files, e.g. in JPEG or GIF format.

C. Lessons Learned

SUIF defines an Optimization Programming Interface (OPI) for developers to add their own passes. By abiding by this OPI, we are able to separate the algorithm details from the substrate IR (i.e. the SUIF IR); thus both the portability of the code and the productivity of coding are enhanced. SUIF is also packaged with several built-in libraries facilitating control-flow and data-flow analysis. Making use of these library functions greatly reduced the workload of trivial implementations. For example, in the data-dependence analysis the Static Single Assignment (SSA) library was used to directly provide the def-use chains, leaving us only the work of assembling that information in PDG form.

III. RESULTS

The PDG generator pass was run on a set of network application benchmarks to verify the validity of the pass and to collect program dependence information.

First we look at a code segment for checking a packet's integrity, namely the check_sum function; it is one of the most common operations in packet processing systems. The procedure of check_sum is to calculate the 1's complement sum over the packet header octets; it returns true if the result is all 1 bits. The CFG of the function and its corresponding CDG and PDG output by our generator are given in Fig. 2. In the PDG, the edges drawn in solid lines are CDG edges, while the dashed lines are DDG edges. The round vertices represent statement nodes and the diamonds stand for predicate nodes; these two types of nodes are the BBs derived from CFG nodes containing instructions. The pentagonal vertices are region nodes that summarize a set of control dependences, as explained earlier. By the nature of the CDG, the set of nodes that are control-dependent on the same node, such as node 1 and node 6 in Fig. 2, can be executed in parallel, as long as they do not entail any data dependences.
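For reference, the check_sum operation described above can be sketched as follows; this is a minimal version, not the benchmark's actual C source.

    # A minimal sketch of check_sum as described above: the 1's complement
    # sum over the header's 16-bit words must be all 1 bits for an intact
    # packet. Not the benchmark's actual code.

    def check_sum(header: bytes) -> bool:
        """Return True if the header's 1's complement sum is all 1 bits."""
        if len(header) % 2:                   # pad odd-length input
            header += b'\x00'
        total = 0
        for i in range(0, len(header), 2):
            total += (header[i] << 8) | header[i + 1]
        while total >> 16:                    # fold carries back into 16 bits
            total = (total & 0xFFFF) + (total >> 16)
        return total == 0xFFFF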
C. Lessons Learned
SUIF defines an Optimization Programming Interface (OPI) for developers to add their own passes. By adhering to this OPI, we were able to separate the algorithm details from the substrate IR (i.e. SUIF IR); thus the portability of the code and the productivity of coding are both enhanced. SUIF is also packaged with several built-in libraries facilitating control-flow and data-flow analysis. Making use of these library functions greatly reduced the workload of trivial implementations. For example, in the data-dependence analysis the Static Single Assignment (SSA) library was used to directly provide the def-use chains, and we were left only with the work of assembling that information into the PDG form.

III. RESULTS
The PDG generator pass was run on a set of network application benchmarks to verify the validity of the pass and to collect program dependence information.

Firstly we look at a code segment for checking a packet's integrity, namely the check_sum function, one of the most common operations in packet processing systems (a sketch is given below). The procedure of check_sum is to calculate the 1's complement sum over the packet header octets; it returns true if the result is all 1 bits. The CFG of the function and its corresponding CDG and PDG output by our generator are given in Fig. 2. In the PDG, the edges drawn in solid lines are CDG edges while the dashed lines are DDG edges. The round vertices represent the statement nodes and the diamonds stand for predicate nodes; these two types of nodes are the BBs derived from CFG nodes containing instructions. The pentagonal vertices are region nodes that summarize a set of control dependences, as explained earlier. By the nature of the CDG, the set of nodes that are control-dependent on the same node, such as node 1 and node 6 in Fig. 2, could be executed in parallel, as long as they do not entail any data dependences.

Figure 2. Program Dependence Graph Example (panels: CDG and PDG)

A set of tests consisting of several sample code snippets was also conducted. The results were validated by comparing the generated PDGs against those reported in [5] and [7] and some in compiler textbooks. These tests are not necessarily all relevant to network applications, but the comparison results ensured the validity of our PDG pass in general.
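As an aside, the checksum test itself is the standard Internet one's-complement check; a minimal Python rendering (ours, not the benchmark code) is:

```python
def check_sum(header: bytes) -> bool:
    """1's-complement sum over the 16-bit words of a header; valid if all bits are 1."""
    total = 0
    for i in range(0, len(header), 2):
        hi = header[i]
        lo = header[i + 1] if i + 1 < len(header) else 0
        total += (hi << 8) | lo
        total = (total & 0xFFFF) + (total >> 16)   # fold the end-around carry back in
    return total == 0xFFFF
```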


A. An Example
We demonstrate a concrete example by running the pass on a trie-based IPv4 packet forwarding application. The IPv4 forwarding code was adopted from PacketBench [9]. In order to generate the PDG of the whole IPv4 packet forwarding application, we inlined all the functions. It is common to do so for network applications, since the applications themselves are usually small in C code size.

The major procedures of IPv4 forwarding include building a route table during system initialization; checking the packet type (dropping non-IP packets); validating the integrity of the packet; checking the Time To Live (TTL) field and decrementing it; updating the checksum; and finally looking up the destination address in the route table to determine the next-hop port. In our experiment, after inlining all the major functions, the C code is lowered to SUIF IR and then transformed to CFG IR. Our generator pass then takes the CFG IR as input and generates the PDG of the whole application as output. Fig. 3 captures a snapshot of running these steps on a Linux machine.

Figure 3. Steps of Running PDG Pass

Fig. 4 illustrates the generated PDG of the whole packet forwarding application. The graph exposes a clear hierarchy of control dependences. For example, predicate nodes 4, 6, 9, 12, 15, 18, 21, 24 and their respective children nodes are all control-dependent on the entry node, and have no remaining entangling control dependence edges among each other. This means the paths (e.g. from node 9 down to node 11 in the figure) could be grouped together and run independently on one processor. The communication cost, though, is given by the data dependence edges (i.e. the dashed lines) that connect the nodes on a path.

IV. RELATED AND FUTURE WORK
Previous researchers have employed the PDG in various ways in static program analysis. In [10], Gong et al. also constructed the PDG in the SUIF compiler to facilitate logic synthesis. Due to their special application domain, their PDG data structure was different, with the SSA form incorporated. In our approach, by contrast, we directly made use of SSA to collect data dependences. The Linda Compiler, a precursor in developing language support for parallel systems, also exploited SUIF to generate the PDG for its internal work flow [11]. Their approach is close to ours except that their intended use of the PDG was for message communication in distributed-memory systems. Moreover, the Linda Compiler was based on the old SUIF1, which is superseded by the newer


SUIF2 we employed.

Figure 4. PDG of IPv4 Packet Forwarding

The two compiler frameworks are not compatible and, according to the SUIF group's documentation [8], SUIF1 is less flexible in modular design, code reuse, etc. We believe our contribution is more applicable for today's use.

Our ongoing work on network application partitioning and mapping for network processor systems will make extensive use of the PDG generated by this SUIF pass. In [5] an algorithm adapted from the Min-Cut/Max-Flow problem was implemented to take the PDG as the input graph and regard the weights of the edges as the flow capacities in the Max-Flow problem. It aimed to minimize the communication cost (including both control dependence and data dependence) among the partitions and to balance the resource utilization of the network processors. We plan to investigate other heuristics for solving the partitioning and mapping problem for network applications, and to take other performance metrics into consideration.

In addition, the PDG could be used in other compiler optimizations such as efficient data mapping in the presence of a cache system, branch speculation and loop optimizations. We will experiment to verify their validity in network processor systems.

V. CONCLUSION
In order to perform certain analyses and optimizations in compilers, an efficient representation that explicitly captures the control-flow and data-flow dependence information of the source code is needed. The Program Dependence Graph is an example of such a representation. In this work, we designed and implemented a compiler pass in the Machine SUIF infrastructure that generates the Program Dependence Graph IR. The generated PDG was used to analyze the dependence hierarchy of network application benchmarks. The output of our pass can also be fed into Graphviz to obtain a visualized image. In the future, the PDG will be input into application partitioning and mapping algorithms to evaluate the performance of different partitioning and mapping heuristics.

ACKNOWLEDGMENT
This work was funded by the Irish Research Council for Science, Engineering and Technology (IRCSET) under the Postgraduate Scholarship Scheme.

REFERENCES
[1] Q. Wu and T. Wolf, "On runtime management in multi-core packet processing systems," Proceedings of the 4th ACM/IEEE Symposium on Architectures for Networking and Communications Systems, ACM, New York, NY, USA, 2008, pp. 69-78.
[2] R. Ramaswamy, N. Weng, and T. Wolf, "Application analysis and resource mapping for heterogeneous network processor architectures," Network Processor Design: Issues and Practices, vol. 3, 2005, pp. 277-306.
[3] J. Yao, Y. Luo, L. Bhuyan, and R. Iyer, "Optimal network processor topologies for efficient packet processing," IEEE Global Telecommunications Conference, 2005. GLOBECOM'05.
[4] X. Huang and T. Wolf, "Evaluating Dynamic Task Mapping in Network Processor Runtime Systems," IEEE Transactions on Parallel and Distributed Systems, vol. 19, 2008, pp. 1086-1098.
[5] J. Yu, J. Yao, L. Bhuyan, and J. Yang, "Program mapping onto network processors by recursive bipartitioning and refining," Proceedings of the 44th Annual Conference on Design Automation, San Diego, California: ACM, 2007, pp. 805-810.
[6] M.D. Smith and G. Holloway, "An introduction to Machine SUIF and its portable libraries for analysis and optimization," Division of Engineering and Applied Sciences, Harvard University, 2002.
[7] J. Ferrante, K.J. Ottenstein, and J.D. Warren, "The program dependence graph and its use in optimization," ACM Trans. Program. Lang. Syst., vol. 9, 1987, pp. 319-349.
[8] http://suif.stanford.edu/suif/suif2/doc-2.2.0-4
[9] R. Ramaswamy and T. Wolf, "PacketBench: a tool for workload characterization of network processing," Workload Characterization, 2003. WWC-6. 2003 IEEE International Workshop on, 2003, pp. 42-50.
[10] W. Gong, G. Wang, and R. Kastner, "A High Performance Application Representation for Reconfigurable Systems," Intl. Conf. on Engineering of Reconfigurable Systems and Algorithms (ERSA), Las Vegas, NV, USA, 2004.
[11] J. Fenwick and L. Pollock, "Implementing an optimizing Linda compiler using SUIF," 1996.
[12] http://www.graphviz.org


Section 6A
DIGITAL HOLOGRAPHY


Segmentation and three-dimensional visualisation of digital in-line holographic microscopy data

Karen M. Molony
Department of Computer Science
National University of Ireland Maynooth
Maynooth, Co. Kildare, Ireland
kmolony@cs.nuim.ie

Thomas J. Naughton
Department of Computer Science
National University of Ireland Maynooth
Maynooth, Co. Kildare, Ireland
and
University of Oulu, RFMedia Laboratory
Oulu Southern Institute, Vierimaantie 5, 84100 Ylivieska, Finland
tomn@cs.nuim.ie

Abstract
This paper demonstrates that transmissive or partially transmissive scenes imaged by digital in-line holographic microscopy (DIHM) can be reconstructed as a three-dimensional (3-D) model of the imaged volume from a single capture. This process entails numerical reconstruction, segmentation and polygonisation. Numerical reconstruction of a digital hologram captured using a DIHM set up is performed at equally spaced depths within a range. In the case of intensity-modulating objects, segmentation of each of the reconstructed intensity images produces a contour slice of the scene by applying an adaptive threshold and border following. These slices are visualised in 3-D by polygonising the data using the marching cubes algorithm. We present experimental results for a real-world DIHM capture of a partially transmissive scene that demonstrate the steps in this process.

Keywords: Digital in-line holographic microscopy, segmentation, three-dimensional visualisation

1. Introduction
Holography [1, 2] is an imaging science made up of two parts, recording and replay. Traditionally, photographic films were used to record holograms. A hologram encodes 3-D information: the intensity and directional information of the optical wave-front. Digital holography is derived from conventional holography but uses a digital area sensor instead of a photographic medium in order to capture the holograms, and reconstruction is performed numerically on a computer [3, 4]. This has only recently become feasible due to advances in computer technology and CCD sensors with high spatial resolution and high dynamic range. The microscopic principle originally proposed by Gabor [1] is the simplest realization of holography and has been coined digital in-line holographic microscopy (DIHM) [5]. It is this optical recording set up that we use in this paper.

Typical microscopy approaches, e.g. confocal microscopy, require dyes to make (quasi-)transparent biological samples, which are compressed between glass slides, visible. DIHM enables biological specimens to be analyzed at a cellular level in a completely unaltered environment in 3-D. Manual analysis of this data by a biologist is a tedious process. A further difficulty is introduced when using DIHM data, as DIHM is an emerging technology with which biologists are not yet familiar. This is compounded by the


fact that the DIHM data is 3-D data. Extraction of 3-D data from DIHM data, and its subsequent analysis, is an unsolved problem.

In this paper we focus on visualising 3-D features of scenes imaged by DIHM. In order to achieve this, reconstructions at various depths are obtained. A two-part segmentation process is applied: an adaptive thresholding step and then a border following step comprise the two-dimensional (2-D) segmentation that is applied to each of these reconstructions. The marching cubes algorithm is then applied to the multiple segmented images to polygonise the dataset. This renders the surface of the volume that was imaged, and so a 3-D segmentation and visualisation technique for DIHM is presented.

In section (2) an overview of the DIHM set up and reconstruction process is described. Further explanation of this type of optical setup can be found in [6]. The segmentation approach used is detailed in section (3). In section (4) a description of the marching cubes algorithm is provided. We present some experimental results in section (5) and we conclude in section (6).

2. Digital In-Line Holographic Microscopy

Figure 1. DIHM set up, physically comprised of a light source, pinhole, sample, CCD and a computer for numerical processing.

The first of the two stages of holography involves the recording of an interference pattern from an object beam and a reference beam. The second stage is replaying the wavefront of the original object from this recording. The output of the numerical reconstruction is typically a 2-D description of the wavefront at a single distance from the camera. The DIHM set up requires only a point light source, a pinhole, a transmissive or partially transmissive scene to be imaged, and an intensity recording device. This is suitable for biological samples [5], which are (quasi-)transmissive. As shown in Fig. 1, a spherical diverging beam r(x) emerges from a pinhole, illuminating an object, f(x), a distance d_1 away. Immediately behind this plane there is an object wave, o(x) = f(x)r(x). The interference pattern, H(x'), between the propagated reference wave, R(x'), and the propagated object wave, O(x'), is captured on a CCD a further distance d_2 away. This capture is the input to the numerical reconstruction part of the imaging system.

Reconstruction of a capture obtained by this set up is possible on a computer by numerically calculating a diffraction integral that describes the diffraction in free space by the recorded hologram [7]. The sampling conditions for the Fresnel integral have been formalised [8, 9] and numerical approximations of the Fresnel transform (FST) have been applied successfully for digital hologram reconstruction [9, 10, 11]. Some fast algorithms for calculating free-space Fresnel diffraction patterns have been developed [12, 13] and are applied for reconstructing the DIHM holograms in the experiments described.
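As a minimal sketch of this numerical reconstruction step, the following Python code applies a single-FFT Fresnel approximation at one depth; the wavelength, pixel pitch and depth range are placeholder assumptions, and this is not the exact fast algorithm of [12, 13]:

```python
import numpy as np

def fresnel_reconstruct(hologram, wavelength, depth, pitch):
    """Reconstruct the intensity of a hologram at one depth (single-FFT Fresnel method)."""
    N, M = hologram.shape
    y, x = np.indices((N, M))
    x = (x - M / 2) * pitch
    y = (y - N / 2) * pitch
    chirp = np.exp(-1j * np.pi / (wavelength * depth) * (x ** 2 + y ** 2))
    return np.abs(np.fft.fftshift(np.fft.fft2(hologram * chirp))) ** 2

# A stack of depth slices through the imaged volume (placeholder parameters):
# slices = [fresnel_reconstruct(H, 633e-9, d, 7e-6) for d in np.arange(0.20, 0.34, 0.005)]
```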


3. Segmentation
In image processing, segmentation subdivides an image into regions. In digital holography, not all parts of a 3-D imaged object will necessarily be in focus in a given 2-D reconstruction at a specific depth. Segmentation can be performed on a single reconstruction at a specific depth, or on multiple independently focused reconstructions at different depths [14], which, when combined, can be considered a topography of the scene. In conventional digital holography a range of perspectives allows manipulation of the scene to overcome occlusions [15]. In DIHM such a range is not available due to the proximity of the components in the set up. However, as DIHM is applied to transparent objects, retrieving segmentations at multiple depths can be achieved without the obstruction of occluding features. At any given depth, a reconstruction will show features in focus at that depth and all out-of-focus features with reduced clarity [14]. Applying an adaptive threshold to filter out-of-focus features in an intensity reconstruction of a DIHM capture allows a successful application of border following.

3.1. Adaptive Threshold
In order to remove out-of-focus background features from a reconstructed intensity image a threshold can be applied. It is assumed that the in-focus features are brighter with respect to the background than out-of-focus features. A straightforward threshold is not as applicable if the background is not even. There is ample opportunity for noise to enhance an uneven background in a DIHM reconstruction: for example the DC term and twin image [16], speckle noise [17], and out-of-focus features themselves can influence the background. Therefore an adaptive threshold [18] is applicable. An example of adaptive thresholding is shown in Fig. 2.

Figure 2. Adaptive thresholding applied to the text on the left with an inconsistent background produces the binary result on the right hand side.

OpenCV [19] provides an implementation of this, which we apply in the grayscale range {0-255} on block sizes of approximately 2% of our image using mean weighting of pixels. The variable threshold is the weighted average of pixels within a block minus an offset. This provides a binary image of foreground and background features.

3.2. Border following
Border following is an algorithm that takes a binary image as input and outputs the borders of the objects, and of the holes within those objects, in the image. By indexing into the 2-D raster, each pixel is checked to see if it is a foreground pixel. Once such a boundary pixel is found, i.e. where the previous pixel was a background pixel, its connected pixels are examined. Usually the next pixel to be checked is in the same direction as the one that has just been found. This process is repeated for each subsequent connected neighbour pixel, tracing the boundary until no more connected border pixels are found. Typically, boundaries of objects are assumed to be in 8-connectivity and boundaries of holes in 4-connectivity [20], and so they can be treated distinctly in the output. An example of border following is shown in Fig. 3.

Figure 3. Border following applied to the binary input on the left with a size restriction results in the image on the right.

OpenCV provides an implementation of this based on [20]. The output is a contour, or a list of points that comprise a curve in an image [19]. This list can be manipulated to only consider contours in a given range of lengths, using a-priori knowledge of sought features to limit the number of false feature boundaries isolated.
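A compact sketch of this two-stage 2-D segmentation, using OpenCV's Python bindings (the paper used OpenCV [19]; the exact block size, the offset of 5 and the OpenCV 4 findContours signature below are assumptions):

```python
import cv2
import numpy as np

def contour_slice(reconstruction, min_length=1000):
    """Adaptive threshold then border following; returns contours above a length limit."""
    img = cv2.normalize(reconstruction, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    block = max(3, (int(0.02 * img.shape[0]) // 2) * 2 + 1)      # odd block, ~2% of image
    binary = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, block, 5)  # offset of 5 assumed
    contours, _ = cv2.findContours(binary, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
    return [c for c in contours if cv2.arcLength(c, True) >= min_length]
```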


4. 3-D visualisation
As segmentation is applied at a range of depths, a stack of binary contour images can be obtained and a volume is constructed from these. The data is now represented by slice segmentations through the scene. There are various scanning technologies that obtain data stored in a similar manner, e.g. confocal microscopy and magnetic resonance imaging (MRI). The significant difference with DIHM data is that the slices are all obtained from a single capture and the volume is not physically scanned; therefore movement of the sample is not a consideration. Furthermore, no alteration of the sample is required, whereas in confocal microscopy, for example, fluorescence or chemical dyes are required. Some 3-D visualisation techniques for data stored in this way have been explored [21]. Marching cubes [22] is an algorithm that was developed specifically for visualising this type of data and has proved effective for the polygonisation of similar scalar data [23].

4.1. Marching Cubes
Marching cubes [22] is an algorithm that generates triangles to represent surfaces. A 3-D raster is subdivided into cubes. Each vertex of a cube is numbered 0-7 as shown in Fig. 4.

Figure 4. A volume is subdivided into cubes. The vertices of such a cube are numbered 0-7.

Based on the values of the vertices of the cube with respect to an isosurface value, that cube is considered to be wholly inside or outside of a surface, or intersected by a surface. If some vertices are higher than the isosurface value and some are lower, then triangles are drawn on the edges of the cube to demonstrate the intersections. There are 2^8 possibilities and so a known look-up table is used which returns a twelve-bit result, one bit for each edge of the cube, to show the intersections on this cube. A look-up table of corresponding triangles is also provided for computational efficiency. An example is shown in Fig. 5, where vertex 3 and vertex 5 are below the isosurface and the resulting intersecting triangles are drawn.

Figure 5. Triangles drawn for a cube where only vertex 3 and vertex 5 are below the isosurface.

An example of marching cubes applied to a stack of the same input binary image is shown from two different perspectives in Fig. 6.

Figure 6. For a stack of a repeated contour image, above, a 3-D rendering can be constructed of the surfaces and viewed from different perspectives, below.
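The polygonisation step can be sketched with an off-the-shelf marching cubes implementation; scikit-image's routine is an assumed stand-in for the authors' code, and the toy volume below merely stands in for the stack of contour slices:

```python
import numpy as np
from skimage import measure

volume = np.zeros((20, 64, 64), dtype=np.float32)   # depth x rows x cols
volume[5:15, 20:44, 20:44] = 1.0                    # toy stand-in for stacked contour slices
verts, faces, normals, values = measure.marching_cubes(volume, level=0.5)
# verts and faces define the triangulated surface, ready for rendering from any angle
```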


5. Experimental Results
In this paper we detail the steps involved in the segmentation and visualisation of transmissive or partially transmissive scenes. These steps are illustrated here using a hologram of a cross-hair sample captured using a DIHM set up. While this is not ideal, as hair is opaque, this hologram does allow for the demonstration of the process, since the opaque sample is small with respect to the imaged scene. A digital hologram of two hairs at one depth from the camera, and a single hair at a different depth, were imaged by the DIHM process described previously, as shown in Fig. 7.

Figure 7. Hair hologram captured using a DIHM set up.

The scene can be reconstructed numerically from this single capture so that the foreground cross hairs and the single background hair can be displayed in focus separately. In Fig. 8.A and 8.B numerical reconstructions of the scene are presented at reconstruction depths of 235 mm and 300 mm respectively. Note that the reconstruction distances are the physical distances magnified; a human hair is approximately 100 µm thick. Reconstructions for depths ranging over {200-335} mm in steps of 5 mm were computed.

Figure 8. A. Reconstruction at depth = 235 mm; B. reconstruction at depth = 300 mm.

The results of applying an adaptive threshold to the images shown in Fig. 8 are shown in Fig. 9. Following on from this, border following was applied to each thresholded result, and the resulting contours were saved individually for further processing, where each result represents a contour slice through the scene at that reconstruction depth. Examples of contour slices corresponding to the reconstructions shown in Fig. 8 are shown in Fig. 10. Since we only consider contours of length at least 1000 pixels, noise arising from out-of-focus features is omitted from the contour slices.

Figure 9. A, B are thresholds of Fig. 8.A, B.

Figure 10. A, B are contours of Fig. 9.A, B.

The previously computed stack of contour slices comprises the 3-D input raster for the marching cubes algorithm. As the experimental sample is non-transmissive, the contours are of the hairs in focus and nearly in focus, and so a 3-D type representation can be made. For a transmissive sample, however, contour slices are of in-focus features only throughout the scene, so a 3-D model representative of the volume can be obtained. As can be seen in Fig. 11, the polygonised surface of the scene can be viewed from any angle. The cross hairs at one plane, intersecting on the left hand side in both views in Fig. 11, can be seen at a different depth than the single hair by looking at the z-axis shown.

Figure 11. Two different perspectives of a 3-D visualisation of contour slices.

As mentioned, the contour slices here are for the in-focus and nearly in-focus hairs, and so the 3-D representation is not a 3-D model of the hair but rather the outlines of the hairs in the x-y plane at a range of depths. It is expected that a 3-D visualisation of a transmissive biological sample would appear more like the test example shown in Fig. 12, where each contour slice represents the exact boundaries of in-focus features at that depth. The test sample is constructed from a stack of the same 2-D segmentation of a hologram of a 2-D sample of mammalian cells on a glass slide, which is shown in the top left corner.

6. Conclusions
We have shown that multiple 2-D segmentations of a scene at different depths can be obtained from a single hologram of a transmissive or partially transmissive scene. These 2-D segmentations can then be used to model a 3-D surface of the imaged sample. This is achieved by first applying an adaptive threshold to numerical reconstructions, for a range of depths, of a single digital hologram captured using a DIHM set up. Then border following is applied to the resulting binary image, which produces a list of contours. Only contours within a specified range of lengths are selected. These lists are segmented slices, or contour slices, of the in-focus features at the corresponding reconstruction depth. Finally, these contour slices are input to the marching cubes algorithm in order to polygonise the surface. This allows a 3-D visualisation of the imaged sample. The steps involved in this process were demonstrated using a test case of a real-world hologram of hair. Future work on this topic requires DIHM captures of volumes of transmissive samples.


Figure 12. 3-D visualisation of 200 repeated 2-D contour slices, shown at the top left corner, of a transmissive sample.

References
[1] D. Gabor, "A new microscopic principle," Nature, vol. 161, pp. 777-778, 1948.
[2] E. N. Leith and J. Upatnieks, "Wavefront reconstruction with diffused illumination and three-dimensional objects," J. Opt. Soc. Am., vol. 54, pp. 1295-1301, 1964.
[3] T. M. Kreis, Handbook of Holographic Interferometry. Wiley-VCH, 2005.
[4] U. Schnars and W. P. O. Jüptner, Digital Holography. Springer, 2004.
[5] J. Garcia-Sucerquia, W. Xu, S. K. Jericho, P. Klages, M. H. Jericho, and H. J. Kreuzer, "Digital in-line holographic microscopy," Appl. Opt., vol. 45, pp. 836-850, 2006.
[6] K. M. Molony, B. M. Hennelly, D. P. Kelly, and T. J. Naughton, "Reconstruction algorithms applied to in-line Gabor digital holographic microscopy," in preparation.
[7] G. Pedrini, P. Fröning, H. Fessler, and H. J. Tiziani, "In-line digital holographic interferometry," Appl. Opt., vol. 37, pp. 6262-6269, 1998.
[8] F. Gori, "Fresnel transform and sampling theorem," Opt. Commun., vol. 39, pp. 293-297, 1981.
[9] A. Stern and B. Javidi, "Analysis of practical sampling and reconstruction from Fresnel fields," Opt. Eng., vol. 43, pp. 239-250, 2004.
[10] Y. Zhang, G. Pedrini, W. Osten, and H. J. Tiziani, "Reconstruction of in-line digital holograms from two intensity measurements," Opt. Lett., vol. 29, pp. 1787-1789, 2004.
[11] Y. Frauel, T. J. Naughton, O. Matoba, E. Tajahuerce, and B. Javidi, "Three-dimensional imaging and processing using computational holographic imaging," Proc. IEEE, vol. 94, pp. 636-653, 2006.
[12] D. Mas, J. Garcia, C. Ferreira, L. M. Bernardo, and F. Marinho, "Fast algorithms for free-space diffraction patterns calculation," Opt. Commun., vol. 164, pp. 233-245, 1999.
[13] D. Mas, J. Pérez, C. Hernández, C. Vázquez, J. Miret, and C. Illueca, "Fast numerical calculation of Fresnel patterns in convergent systems," Opt. Commun., vol. 227, pp. 245-258, 2003.
[14] C. P. McElhinney, J. B. McDonald, A. Castro, Y. Frauel, B. Javidi, and T. J. Naughton, "Depth-independent segmentation of macroscopic three-dimensional objects encoded in single perspectives of digital holograms," Optics Letters, vol. 32, pp. 1229-1231, 2007.
[15] J. Maycock, C. P. McElhinney, B. M. Hennelly, T. J. Naughton, J. B. McDonald, and B. Javidi, "Reconstruction of partially occluded objects encoded in three-dimensional scenes using digital holograms," Applied Optics, vol. 45, pp. 2975-2985, 2006.
[16] J. W. Goodman, Introduction to Fourier Optics. Roberts & Company Publishers, 2004.
[17] J. Maycock, B. M. Hennelly, J. B. McDonald, Y. Frauel, A. Castro, B. Javidi, and T. J. Naughton, "Reduction of speckle in digital holography by discrete Fourier filtering," Journal of the Optical Society of America A, vol. 24, no. 6, pp. 1617-1622, 2007.
[18] F. H. Y. Chan, F. K. Lam, and H. Zhu, "Adaptive thresholding by variational method," IEEE Transactions on Image Processing, vol. 7, pp. 468-473, 1998.
[19] G. Bradski and A. Kaehler, Learning OpenCV: Computer Vision with the OpenCV Library. O'Reilly Media, Inc., 2008.
[20] S. Suzuki and K. Abe, "Topological structural analysis of digitized binary images by border following," Computer Vision, Graphics, and Image Processing, vol. 30, pp. 32-46, 1985.
[21] K. O'Conor, H. P. Voorheis, and C. O'Sullivan, "3D visualisation of confocal fluorescence microscopy data," Eurographics Ireland Workshop, pp. 49-54, 2004.


[22] W. E. Lorensen and H. E. Cline, "Marching cubes: a high resolution 3D surface construction algorithm," Computer Graphics, vol. 21, pp. 163-169, 1987.
[23] U. Tiede, K. H. Höhne, M. Bomans, A. Pommert, M. Riemer, and G. Wiebecke, "Investigation of medical 3D-rendering algorithms," IEEE Computer Graphics and Applications, vol. 10, pp. 41-52, 1990.


Speed up of Fresnel transforms for digital holography using pre-computed chirp and GPU processing

Nitesh Pandey, Bryan M. Hennelly, Damien P. Kelly and Thomas J. Naughton

Abstract—We show how the common Fresnel reconstruction of digital holograms can be speeded up on ordinary computers by precomputing the two chirp factors for a given detector array size and then calling these values from memory during the reconstruction. The speedup in time is shown for various hologram sizes. We also run the same algorithm on an Nvidia GPU using Matlab.

Index Terms—Digital holography, optics, imaging, FFT

I. INTRODUCTION
Digital holography [1] is a fast growing field with applications in microscopy [2], metrology [3], 3-D information processing [4, 5] and display [6], to name a few. In digital holography, the holograms are recorded electronically by a CCD target. The real image can be reconstructed from the digitally sampled hologram by numerically propagating the wavefield back to the plane of the object using the theory of Fresnel diffraction. Reconstructions based on the Fresnel transform are widely used for large hologram sizes because they traditionally employ the fast Fourier transform (FFT) algorithm [7], which reduces the computations required for an NxN matrix from O(N^2) to O(N log N) steps. The memory consumption and the speed of the FFT make it a highly important and useful algorithm. As CCD sensor sizes increase in terms of pixel numbers and density, the computational complexity of the reconstruction also increases. Here we aim to show how the reconstruction using the Fresnel transform can be simplified using precomputed chirp and phase factors.

Manuscript received May 29th, 2009. The research leading to these results has received funding from the European Community's Seventh Framework Programme FP7/2007-2013 under grant agreement no. 216105. N. Pandey, B. Hennelly and D. Kelly are with the Dept. of Computer Science, NUI Maynooth (e-mail: npandey@cs.nuim.ie). T. Naughton is with the Dept. of Computer Science, NUI Maynooth and with the University of Oulu, RFMedia Laboratory.

II. FRESNEL TRANSFORM
Consider the Fresnel transform below, which describes the relationship between the wavefields h(x, y) and Γ(ξ, η) at two planes separated by a distance d:

$$\Gamma(\xi,\eta) = \frac{ia}{\lambda d}\exp\!\left[\frac{-i\pi}{\lambda d}(\xi^2+\eta^2)\right]\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h(x,y)\exp\!\left[\frac{-i\pi}{\lambda d}(x^2+y^2)\right]\exp\!\left[\frac{i2\pi}{\lambda d}(x\xi+y\eta)\right]dx\,dy \quad (1)$$

where λ is the wavelength, d is the distance and a is the amplitude. In digital holography we deal with discrete representations of this transform, since the hologram is discretized at the CCD into NxM samples at intervals of Δx and Δy in the x and y directions, Δx and Δy being the pixel pitches of the sensor. Direct discretization of the Fresnel integral gives the following:

$$\Gamma(m,n) = \exp\!\left[\frac{-i\pi}{\lambda d}\left(m^2\Delta\xi^2+n^2\Delta\eta^2\right)\right]\sum_{k=0}^{M-1}\sum_{l=0}^{N-1} t(k,l)\exp\!\left[\frac{-i\pi}{\lambda d}\left(k^2\Delta x^2+l^2\Delta y^2\right)\right]\exp\!\left[i2\pi\!\left(\frac{km}{M}+\frac{ln}{N}\right)\right] \quad (2)$$

Here Γ(m, n) is the matrix of NxM points which describes the amplitude and phase distribution of the real image, and Δξ and Δη are the pixel pitches in the reconstructed images. For a thorough investigation of the resulting numerical algorithm, and the range of d over which it is useful, the reader can consult [8]. We assume we have available to us a hologram of NxM size, recorded of an object placed at a distance d from the camera. To calculate the real image field from this hologram, the following method is used.

i) The first complex chirp is calculated and multiplied by the original hologram. To calculate the chirp, NxM computations are needed.
This can be speeded up using vector multiplications (for example in Matlab), but the number of computations remains the same.
ii) A discrete Fourier transform of this matrix is taken. This is done using the FFT algorithm.
iii) The resultant FFT is multiplied by the second chirp factor, which must be computed in the same manner as the first.
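For reference, the direct method reads as follows in a numpy sketch; the authors worked in Matlab, so this translation and the reconstructed pixel pitch Δξ = λd/(MΔx) it uses are our assumptions:

```python
import numpy as np

def make_chirps(shape, wl, d, dx, dy):
    """The two quadratic phase factors of the direct method for an N x M hologram."""
    N, M = shape
    l, k = np.indices((N, M))
    k, l = k - M / 2, l - N / 2
    chirp_in = np.exp(-1j * np.pi / (wl * d) * ((k * dx) ** 2 + (l * dy) ** 2))
    dxi, deta = wl * d / (M * dx), wl * d / (N * dy)   # reconstructed pixel pitches
    chirp_out = np.exp(-1j * np.pi / (wl * d) * ((k * dxi) ** 2 + (l * deta) ** 2))
    return chirp_in, chirp_out

def fresnel_direct(hologram, wl, d, dx, dy):
    chirp_in, chirp_out = make_chirps(hologram.shape, wl, d, dx, dy)
    return chirp_out * np.fft.fft2(hologram * chirp_in)   # steps i) to iii)
```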


In order to speed up the Fresnel reconstruction, we note that the input chirp is independent of the input hologram field. The second chirp and the constant factors are also independent of the input. If we precalculate a range of these matrices for different distances and load the corresponding data from memory during reconstruction, we can save 2NxM computations. This is advantageous for real-time high-resolution digital holographic systems. The time taken to load a pre-calculated matrix is significantly less than the time it takes for 'on the fly' calculation. The computation reduces to two vector multiplications and the FFT algorithm. The algorithm now becomes:
i) Load the two chirp values for the reconstruction distance d and multiply the first chirp by the hologram.
ii) Take the FFT.
iii) Multiply by the second chirp.

We further note that for the case of human visualization of reconstructed holograms, which is of sole importance in holographic 3D-TV [9] and in applications such as holographic endoscopes [10], the phase factors (step iii) can be neglected. This offers an additional speedup on large matrices. To test the improvement in reconstruction time, we used sections of a sample hologram of a macroscopic object of height 3 cm, recorded in an in-line-like geometry on a 1392x1040 CCD, pixel size 6.45 µm (AVT Dolphin), using a 785 nm laser. The average times taken for reconstruction, and the speedup achieved, on a computer (Intel Pentium 4 3.0 GHz CPU, 1 GB RAM) are shown in Table 1.

Table 1. Reconstruction times on the CPU.
Hologram size | No loading (time in seconds) | With loading (time in seconds) | Speedup
100x100       | 0.0613  | 0.04731 | 1.30x
200x200       | 0.1027  | 0.07023 | 1.46x
500x500       | 0.4390  | 0.26449 | 1.66x
1000x1000     | 1.7360  | 0.99297 | 1.74x
2000x2000     | 7.1904  | 4.22684 | 1.70x
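The precomputation itself amounts to a lookup keyed on array size and distance; a minimal sketch (reusing make_chirps from the sketch above, with an in-memory dictionary standing in for the stored table):

```python
chirp_cache = {}   # (shape, distance) -> (chirp_in, chirp_out), computed once

def fresnel_precomputed(hologram, wl, d, dx, dy):
    key = (hologram.shape, d)
    if key not in chirp_cache:
        chirp_cache[key] = make_chirps(hologram.shape, wl, d, dx, dy)
    chirp_in, chirp_out = chirp_cache[key]
    return chirp_out * np.fft.fft2(hologram * chirp_in)   # 2 multiplies + 1 FFT
```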
Recently the use of graphics cards for general-purpose computing has become popular. Algorithms designed to exploit the parallel many-core capability of the GPU offer a significant speedup (5x-20x) over CPUs. GPGPU, as it is called, has already been used by a few groups to demonstrate the speed of reconstruction of holograms on the GPU architecture [11, 12]. Here we show how the reconstruction and preloading work on an NVIDIA GeForce graphics card in an AMD Athlon 64 X2, 2.31 GHz, 2 GB RAM machine. We use the Jacket [13] engine for Matlab to run our code on the GPU. The results are shown in Table 2.

Table 2. Reconstruction times on the GPU.
Hologram size | No loading (time in seconds) | With loading (time in seconds) | Speedup
100x100       | 0.0629 | 0.0555 | 1.13x
200x200       | 0.0646 | 0.0522 | 1.23x
500x500       | 0.0854 | 0.0672 | 1.27x
1000x1000     | 0.1819 | 0.0775 | 2.34x
2000x2000     | 0.5336 | 0.2514 | 2.12x

III. CONCLUSIONS
We have shown the benefits of using a table of precomputed chirp matrices and phase factors on the speed of digital hologram reconstruction on normal CPUs and GPUs. The speedup improves with larger matrices and occurs due to the fact that loading large data from memory takes very little time compared to calculation 'on the fly'. In this paper we have limited our investigation to the direct method (see Equation 2) of computing the Fresnel transform, which is made up of two chirp multiplications and an FFT [8]. Other methods of reconstruction also require calculation of chirp data which is independent of the input hologram. The convolution approach, for example [8], is based on the description of the Fresnel transform as a chirp multiplication in the Fresnel domain. This method requires calculation of one chirp and two FFTs; thus precalculating will lead to a time saving of half that of the direct Fresnel approach. The slightly more accurate method based on the Fresnel-Kirchhoff transformation requires an FFT, followed by multiplication by a chirp-like function, followed by a second FFT. Since this chirp-like function (which reduces to a chirp in the paraxial Fresnel approximation) is independent of the input hologram field, its precalculation will also result in considerable time savings. A large number of these chirp matrices, covering a large range of distances, can be stored permanently in memory and accessed by the program whenever a reconstruction is demanded for a particular distance.

REFERENCES
[1] U. Schnars and W. Juptner, "Digital recording and numerical reconstruction of holograms," Meas. Sci. Technol. 13, 85-101 (2002).
[2] P. Marquet, B. Rappaz, P. J. Magistretti, E. Cuche, Y. Emery, T. Colomb, and C. Depeursinge, "Digital holographic microscopy: a noninvasive contrast imaging technique allowing quantitative visualization of living cells with subwavelength axial accuracy," Opt. Lett. 30, 468-470 (2005).
[3] C. Wagner, W. Osten, and S. Seebacher, "Direct shape measurement by digital wavefront reconstruction and multiwavelength contouring," Opt. Eng. 39, 79-85 (2000).
[4] B. Javidi and E. Tajahuerce, "Three-dimensional object recognition by use of digital holography," Opt. Lett. 25, 28-30 (2000).
[5] Y. Frauel, T. J. Naughton, O. Matoba, E. Tajahuerce, and B. Javidi, "Three-dimensional imaging and processing using computational holographic imaging," Proceedings of the IEEE, vol. 94, no. 3, pp. 636-653, March 2006.
[6] Digital Holography and Three-Dimensional Display: Principles and Applications, T.-C. Poon (Ed.), 2006, XIII, 430 p.
[7] E. O. Brigham, The Fast Fourier Transform, New York: Prentice-Hall, 2002.
[8] D. Mas, J. Garcia, C. Ferreira, and L. M. Bernardo, "Fast algorithms for free-space diffraction patterns calculation," Opt. Commun. 164, 233-245 (1999).


[9] L. Onural and H. Ozaktas, "Signal processing issues in diffraction and holographic 3DTV," in Proc. EURASIP 13th European Signal Processing Conference (2005).
[10] S. Schedin, G. Pedrini, H. J. Tiziani, and A. K. Aggarwal, "Comparative study of various endoscopes for pulsed digital holographic interferometry," Appl. Opt. 40, 2692-2697 (2001).
[11] L. Ahrenberg, P. Benzie, M. Magnor, and J. Watson, "Computer generated holography using parallel commodity graphics hardware," Opt. Express 14, 7636-7641 (2006).
[12] T. Shimobaba, Y. Sato, J. Miura, M. Takenouchi, and T. Ito, "Real-time digital holographic microscopy using the graphic processing unit," Opt. Express 16, 11776-11781 (2008).
[13] www.accelereyes.com


Twin removal in digital holography by means of speckle reduction
(Revised May 2009)

David S. Monaghan*, Damien P. Kelly, Nitesh Pandey and Bryan M. Hennelly
Department of Computer Science, National University of Ireland, Maynooth, Co. Kildare, Ireland.

Abstract -- A method for numerically removing the twin image in on-axis digital holography, based on multiple digital holograms, is discussed. The digital holograms under examination are captured experimentally using an in-line modified Mach-Zehnder interferometric setup and subsequently reconstructed numerically. The technique is suitable for a transmission geometry. Each individual hologram is recorded with a statistically independent diffuse illumination field. This is achieved by shifting a glass diffuser in the x-y plane of the object path. By recording the holograms in this manner the twin image, in a numerical reconstruction, appears as speckle. By reducing this speckle pattern the twin image can be effectively removed in the reconstruction plane. A theoretical model is developed and experimental results are presented that validate this model.

Index Terms—Digital holography, on-axis, speckle, twin reduction

I. INTRODUCTION
In recent years there has been a great deal of interest in the field of Digital Holography (DH) [1-5] and in 3D display [6, 7] and capture technology. This is apparent from the number of papers published in the literature. Holography [8, 9] is a method for capturing the complex field of an object, from which three-dimensional structure can be obtained. Recent technological improvements, such as CCD cameras, high-powered desktop computers and spatial light modulators, have made DH a viable alternative to traditional holography. DH boasts advantages such as digital storage, processing and compression of holograms, and transmission over existing digital infrastructure. In this paper we will examine a method of twin removal in digital holography based on a speckle reduction technique [10, 11].

Manuscript submitted May 29th 2009. Corresponding author: D. S. Monaghan, e-mail: davidm@cs.nuim.ie, ph. +35317083849, fax +35317083848.

II. THEORY
In the following section we present a simple theoretical model to describe the behaviour of our optical system. Our aim here is not to conduct a fully rigorous examination of the complex interaction of multiple speckle fields and various apertures in the system, but rather to present a plausible description of the complex behaviour of the system; a more complete analysis would take us far beyond the scope of this manuscript. A collimated plane wave is generated using a spatial filter and lens, as depicted in Figure 1. This collimated plane wave is then incident on a diffuser. We assume that the diffuser is optically rough and imparts a random phase to the plane wavefront that emerges from the diffuser. This random phase field now propagates to the object plane where it illuminates our transmissive target. We describe the random field that illuminates our object as

$$U_R(X) = a_R(X)\exp[j\phi_R(X)], \quad (1)$$

where a_R(X) and φ_R(X) are random amplitude and phase values respectively. We note that the random phase field at the output of the diffuser gives rise to both random amplitude and phase values (a_R and φ_R) at our object plane, due to diffraction introduced by the finite extent of the diffuser. For simplicity, however, we assume that the diffuser is sufficiently large and the distance d_1 sufficiently small (see Figure 1) such that the resulting speckle field in the object plane may be assumed to be delta correlated. We describe the effect of our transmissive object as

$$U_T(X) = a_T(X)\exp[j\phi_T(X)], \quad (2)$$

and write the field immediately after our object as

$$U(X) = U_T(X)\,U_R(X). \quad (3)$$

This combined field, U(X), is now allowed to propagate to the CCD plane, where it interferes with an ideal unit-amplitude plane wave, R(x) = exp[j2π(z − z_c)/λ], and the resulting interference pattern is recorded. We write the continuous intensity distribution incident upon the camera face as

$$H(x) = |u_z(x) + R(x)|^2, \quad (4a)$$
$$H(x) = I_z + I_R + u_z(x)R^*(x) + u_z^*(x)R(x), \quad (4b)$$

where I_z and I_R are the DC terms corresponding to the object and reference intensities respectively and '*' denotes the complex conjugate operation. The latter two terms in Eq. (4b) correspond to the real and twin image terms respectively. The field u_z is related to our object field U(X) by a Fresnel transform,

$$u_z(x) = \mathrm{FST}_z\{U(X)\}(x) = \frac{1}{\sqrt{j\lambda z}}\int U(X)\exp\!\left[\frac{j\pi}{\lambda z}(x-X)^2\right]dX, \quad (5)$$

where FST_z denotes the Fresnel transform operator over a distance z. We now make some more simplifying approximations.


In practical DH systems the continuous intensity field, Eq. (4b), is recorded by a camera of finite physical extent using finite-size pixels located at fixed distances from each other. Each of these factors acts to significantly limit the imaging performance of DH systems, and we refer the reader to [12] for more detail. However, for our purposes we do not need to consider these aspects of the imaging process to get across the essence of our idea, and so for simplicity we assume that the continuous field H(x), Eq. (4b), is available to us. We note that the DC terms can be removed either numerically [13] or by recording the reference and object intensities separately and subtracting them from the captured hologram. Setting z = z_c in Eq. (4b), removing the DC terms and performing an inverse Fresnel transform yields the following result:

$$A(X) = U(X) + \mathrm{FST}_{-z}\{u_z^*(x)\}(X), \quad (6a)$$
$$A(X) = U(X) + \tilde{U}(X), \quad (6b)$$

where Ũ(X) is the twin image term.

To remove the twin image term requires that we capture multiple digital holograms using a series of statistically independent speckle fields to illuminate our object. We assume that each of these statistically independent fields has the same average intensity M. Each of the resulting digital holograms is then reconstructed, and the reconstructions are averaged on an intensity basis in the object plane. We will now examine what happens to the real and twin terms as we average them on an intensity basis in the object plane. Let us first consider the real image term. Using a random speckle field, denoted 'n', to illuminate our object, the resulting reconstructed intensity pattern of the real image term is given by

$$I_n(X) = U(X)U^*(X) = \big[a_T a_{Rn}\exp[j(\phi_T+\phi_{Rn})]\big]\big[a_T a_{Rn}\exp[-j(\phi_T+\phi_{Rn})]\big] = (a_T a_{Rn})^2. \quad (6)$$

We now average N of these intensity distributions, formed by N statistically different speckle fields:

$$I_{AR}(X) = \frac{I_1 + I_2 + \dots + I_N}{N} = a_T^2\left(\frac{a_{R1}^2 + a_{R2}^2 + \dots + a_{RN}^2}{N}\right) \approx a_T^2 M, \quad (7)$$

where I_AR represents the result of averaging together N real image intensity reconstructions. We turn our attention to the term in round brackets in Eq. (7) and note that the sum of N statistically different intensity patterns, as N goes to infinity, reduces to the average intensity value for an individual speckle distribution. Therefore we may replace the term in the round brackets in Eq. (7) by the average intensity value for a given speckle field, M. It is important to note that the intensity distribution for our object field, a_T^2, is contained in Eq. (7).

We now consider the twin image term. Examining the derivation of Theorem 3 in Ref. [14] we find that u_z^*(x) = FST_{-z}{U^*(X)}(x). Using this result we re-write the twin image term Ũ(X) as

$$\tilde{U}(X) = \mathrm{FST}_{-2z}\{U^*(X)\}(X). \quad (8)$$

The corresponding intensity is given by

$$\tilde{I}_n(X) = \tilde{U}_n(X)\tilde{U}_n^*(X) = \left|\int a_{Rn}(X_1)\,a_T(X_1)\exp\!\big[-j\big(\phi_T(X_1)+\phi_{Rn}(X_1)\big)\big]\exp\!\left[\frac{j\pi}{2\lambda z}(X-X_1)^2\right]dX_1\right|^2. \quad (9)$$

This result means that each intensity distribution generated by the twin image term is a statistically independent speckle pattern. As in the previous case, the a_T exp(jφ_T) term remains constant; however, now each component is multiplied by a random phase and then Fresnel transformed. Thus averaging over N intensity patterns gives the following result:

$$\tilde{I}_{TR}(X) = \frac{\tilde{I}_1 + \tilde{I}_2 + \dots + \tilde{I}_N}{N} \approx M. \quad (10)$$

This latter equation suggests that the twin image becomes gradually reduced as more intensity distributions are averaged together. Finally, it is important to consider the cross-terms (interference between the real and the twin image) that arise when we calculate the intensity of Eq. (6b). This interference term can be re-expressed as

$$CT = |U|\,|\tilde{U}|\cos(\Theta_R), \quad (11)$$

where Θ_R can be shown to be a random variable. Thus we see that the interference term described in Eq. (11) will average to zero and can be neglected. Although the analysis here is presented for transmissive objects (see Ref. [15]), it also seems to apply to reflective objects, as a new series of experimental results indicates.
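As an aside (our illustration, not the authors' code), the averaging argument of Eqs. (6)-(10) can be reproduced in a toy one-dimensional simulation; every parameter below, and the FFT-based Fresnel propagator, are assumptions chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, wl, z, dx = 1024, 633e-9, 0.05, 10e-6
fx = np.fft.fftfreq(N, dx)

def prop(u, dist):
    """Fresnel propagation via the transfer-function (FFT) method."""
    return np.fft.ifft(np.fft.fft(u) * np.exp(-1j * np.pi * wl * dist * fx ** 2))

n = np.arange(N)
U_T = np.where(np.abs(n - N // 2) < 50, 0.3, 1.0)    # simple absorbing object, a_T

avg = np.zeros(N)
for _ in range(14):                                  # 14 holograms, as in the experiment
    U_R = np.exp(1j * 2 * np.pi * rng.random(N))     # fresh diffuser position, Eq. (1)
    u_z = prop(U_T * U_R, z)                         # field at the camera, Eq. (5)
    H = np.abs(u_z + 1.0) ** 2                       # unit-amplitude reference, Eq. (4)
    cross = H - np.abs(u_z) ** 2 - 1.0               # DC terms removed: u_z + u_z*
    rec = prop(cross, -z)                            # real image in focus, twin as speckle
    avg += np.abs(rec) ** 2
avg /= 14        # the speckled twin tends toward a flat background, cf. Eq. (10)
```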
We would like to acknowledge that the theoretical description provided is relatively simplistic; however, it does capture the essence of the underlying physical behaviour of the system, as we shall now demonstrate with a series of experimental results.

III. RESULTS
The experimental set-up is shown in Figure 1. A 678 nm laser is used. The wave-plate in this set-up is used in conjunction with a polarising beam splitter to allow the laser power between the two paths to be adjusted. A piezo mirror, electronically controlled, is employed to impart a phase-shift into the object path of the set-up.


Figure 1. Experimental set-up used in the recording of reflective digital holograms (laser, wave-plate, spatial filter and collimating lens, microscopic objective and collimating lens, diffuser, object, piezo mirror, N.D.F., B.S. and CCD camera), where N.D.F. is a neutral density filter and B.S. is a polarising beam splitter.

This set-up allows a Phase Shift Interferometry (PSI) [16, 17] digital hologram to be captured. The x-y position of the diffuser is moved between each capture of a digital hologram to provide a different and statistically independent speckle pattern in each hologram.

All the holograms presented in Figure 2 are reflection holograms and were recorded using the experimental set-up shown in Figure 1. The reconstruction distance for these holograms is 285 mm. They have been numerically reconstructed using the direct method (a fast Fourier transform based technique) to implement the discrete Fresnel transform [14, 18] (see also Eq. (5)).

Figure 2. Numerical reconstructions of a digital hologram showing (a) the DC term, twin and object, (b) a single hologram and twin, (c) 14 holograms added together, (d) post-processing of (c), (e) a zoomed-in portion of (c) for comparison with 14 PSI holograms added together in (f).

Figure 2(a) shows a reconstruction that contains a strong DC component (or zero-order term), the twin term and the original object. The DC component arises due to the intensity terms that appear as a product of the holographic process. In Figure 2(b) the DC component has been removed by a numerical high-pass filter, and it can be clearly seen that the resultant reconstruction contains the original object and the twin term, which has been reconstructed as a speckle pattern due to the introduction of the diffuser (see Figure 1). Figure 2(c) shows the result when 14 separate holograms have been added together on an intensity basis. It can be seen that the twin term has been significantly reduced when compared with Figure 2(b). The process of addition has produced a background DC term, which has been removed in Figure 2(d) by subtracting the mean value of the background from the entire reconstruction. Figure 2(e) and (f) show a comparison between the speckle reduction method, (e), and a PSI method, (f).
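The PSI reconstruction compared in Figure 2(f) combines four phase-shifted captures; a common four-step combination is sketched below. Treat it as an assumed variant of the methods in refs. [16, 17]: the sign conventions depend on which arm of the interferometer is shifted.

```python
import numpy as np

def psi_field(I0, I90, I180, I270):
    """Complex field from four interferograms with 0, 90, 180, 270 degree shifts."""
    return ((I0 - I180) + 1j * (I90 - I270)) / 4.0   # recovers u_z for a unit reference
```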


IV. CONCLUSION
The presence of a twin image in on-axis digital holography is a fundamental property of a holographic imaging system. The removal or reduction of this twin image is of principal importance in digital holography, as it is present in the reconstructed image as a source of noise. In this paper we have examined a method of twin removal based on a speckle reduction technique. We have shown that this method can be applied to reflective objects in digital holography.

REFERENCES
[1] J. W. Goodman and R. W. Lawrence, "Digital image formation from electronically detected holograms," Appl. Phys. Lett. 11, 77-79 (1967).
[2] U. Schnars and W. Juptner, "Direct recording of holograms by a CCD target and numerical reconstruction," Appl. Opt. 33, 179-181 (1994).
[3] E. Tajahuerce and B. Javidi, "Encrypting three-dimensional information with digital holography," Appl. Opt. 39, 6595-6601 (2000).
[4] M. Liebling, T. Blu, and M. Unser, "Complex-wave retrieval from a single off-axis hologram," J. Opt. Soc. Am. A 21, 367-377 (2004).
[5] L. Onural and P. D. Scott, "Digital decoding of in-line holograms," Opt. Eng. 26, pp. 1124-1132 (1987).
[6] U. Gopinathan, D. S. Monaghan, B. M. Hennelly, C. P. McElhinney, D. P. Kelly, J. McDonald, T. J. Naughton, and J. T. Sheridan, "A projection system for real world three-dimensional objects using spatial light modulators," J. Display Technol. 4(2), pp. 254-261, 2008.
[7] D. S. Monaghan, U. Gopinathan, D. P. Kelly, T. J. Naughton, and J. T. Sheridan, "Systematic errors of an optical encryption system due to the discrete values of a spatial light modulator," Opt. Eng. 48(2), p. 027001, 2009.
[8] D. Gabor, "A new microscopic principle," Nature (London) 161, 777 (1948).
[9] D. Gabor, "Microscopy by reconstructed wavefronts," Proc. R. Soc. A 197, 454 (1949).
[10] J. W. Goodman, Speckle Phenomena in Optics, Roberts & Company, 2007.
[11] J. W. Goodman, "Some fundamental properties of speckle," J. Opt. Soc. Am. 66(11), pp. 1145-1150, 1976.
[12] D. P. Kelly, B. M. Hennelly, N. Pandey, T. J. Naughton, and W. T. Rhodes, "Resolution limits in practical digital holographic systems," (under review, 2009).
[13] T. Kreis and W. Juptner, "Suppression of the dc term in digital holography," Opt. Eng. 36, pp. 2357-2360, 1997.
[14] F. Gori, "Fresnel transform and sampling theorem," Opt. Commun. 39, 293-297 (1981).
[15] D. S. Monaghan, D. P. Kelly, N. Pandey, and B. M. Hennelly, "Twin removal in digital holography via speckle reduction," (under review, 2009).
[16] I. Yamaguchi and T. Zhang, "Phase-shifting digital holography," Opt. Lett. 22, pp. 1268-1270, 1997.
[17] S. Lai, B. King, and M. A. Neifeld, "Wave front reconstruction by means of phase-shifting digital in-line holography," Opt. Commun. 173, 155-160 (2000).
[18] J. Goodman, Introduction to Fourier Optics, 2nd ed., McGraw-Hill, New York, 1996.


Review of Twin Reduction and Twin Removal Techniques in Holography

Bryan M. Hennelly, Damien P. Kelly, Nitesh Pandey, David Monaghan

Abstract—In this paper we review the major contributions over the past sixty years to the subject of twin reduction and twin removal in holography. We show that this collective work may be broken down into a number of categories, including the well known techniques of off-axis holography and phase retrieval.

Index Terms—Holography, in-line, on-axis, twin reduction

I. INTRODUCTION
Holography was invented by Gabor in 1948 [1, 2]. In his initial experiments, involving electron microscopy, an object was irradiated with a radiation beam of strong coherence. The waves weakly scattered by the object interfered with the background wave on a photographic film, where this interference pattern was recorded. This recorded intensity distribution contains information about the amplitude and the phase of the incident object wavefield. The limits to the Gabor experiment are the resolution of the film material and the coherence of the radiation source. Gabor also showed how it was possible to reconstruct the original object wavefield by illuminating the recorded film transparency with the original background wave. However, the image of the reconstructed object is marred by the presence of a twin image, an inherent artifact of the method. After the invention of the laser, methods would later be invented to cleverly evade this twin image by using new experimental architectures, but these would impose greater restrictions on the system, in particular on the resolution of the film. However, for certain radiation sources these new experimental architectures have no physical implementation, or only crude and expensive equivalents. In these cases one must rely upon the initial Gabor architecture, and the twin image remains a problem that must be dealt with. Such is true for many cases, including x-ray holography, gamma-ray holography and electron holography. In the case of digital (optical) holography the Gabor-like in-line architectures also offer advantages, as discussed below.

Manuscript submitted May 29th 2009. The research leading to these results has received funding from the European Community's Seventh Framework Programme FP7/2007-2013 under grant agreement no. 216105. All authors are with the Computer Science Department, National University of Ireland Maynooth, Ireland. Corresponding author: Bryan Hennelly, ph. +35317083849, fax +35317083848, email: bryan.hennelly@gmail.com

Crystallographic structure is often determined using diffraction methods. While electron emission holography is useful for studying surface structure, its anisotropic nature prevents the study of internal structure. The more isotropic scatter of x-rays overcomes this limitation. Improvements in x-ray detectors have enabled the application of x-ray holography to crystallography [3], which is especially useful because the recovered phase information offers an improvement over traditional diffraction techniques. As in the case of gamma-ray holography [4] and electron emission holography [5-10], the in-line architecture is used, and the short distance between the source and the object implies that the twin image will be located very close to the reconstructed atom image. This creates a detrimental and unavoidable source of noise. Low-voltage and high-voltage electron holography are particularly useful in the imaging of weakly scattering objects such as DNA molecules [7, 8].
The severe aberrations caused by lenses in electron imaging make the in-line set-up the method of choice. Another advantage in this case is that the phase sensitivity of in-line electron holography is particularly high.

II. TWIN REDUCTION BY SUBTRACTION
In 1951, soon after Gabor's invention and many years before any off-axis architecture would be developed, Bragg and Rogers developed an innovative solution to the twin image problem [11, 12]. The basic idea is that the disturbance caused by the unwanted twin image on the wanted plane is, in fact, its own hologram. If a second hologram is taken from the original object, which accurately reproduces the disturbance that is due to the unwanted image at the wanted plane, the first reconstruction can be corrected by subtraction. Using a collimated light source we must take the second hologram at twice the distance of the first, since the wanted and unwanted images are formed at equal distances from the hologram on either side of it. The divergent case is somewhat more complicated. It is notable that the method works well with both phase-contrast and amplitude-contrast. This method fell into obscurity until later advances in digital technology allowed for a simplified subtraction process [12]. A similar subtraction-based technique was reported in 1956 [13], where two holograms must be recorded and the object must be repositioned for the second recording, enabling a cancelling of the twin image term. Xiao et al. showed [14] how the Bragg and Rogers method could be applied in x-ray holography in real time by recording two holograms while taking advantage of the penetration property of x-ray radiation.

III. TWIN ELIMINATION BY OFF-AXIS HOLOGRAPHY
After the invention of the laser, Leith and Upatnieks proposed [15] in 1963 a new experimental optical architecture


that enables the complete separation of the twin image term and the zero-order term from the wanted image. In their method the reference beam was incident on the hologram plane at some angle relative to the normal. In this way the twin images were modulated onto well-separated carrier spatial frequencies. The range of separation of the terms is dependent on the angle of the reference beam. The significantly increased bandwidth of the hologram places a much greater requirement on the resolution of the recording material. Furthermore, the architecture imposes the need for numerous optical elements, in particular a beam splitter, which are not readily available in many areas of holography. The increased bandwidth of the hologram for this set-up poses a problem for digital holography. In DH, the pixelated recording cameras have resolutions an order of magnitude less than commercial photo materials. Thus the bandwidth is already severely limited, and the use of an off-axis architecture will only serve to limit it further. Nevertheless, Cuche et al. proposed [16] and experimentally validated the off-axis technique for real-time digital holographic recording. In this case it is possible to digitally spatially filter the hologram to completely remove the DC term and the twin image. In 1966 an alternative to the off-axis method was proposed [17] for recording in the optical Fourier transform domain. In this case the reference beam, a point source in the object plane, can be placed adjacent to the object to effectively create an off-axis hologram in the Fourier domain and to spatially separate the twin images.

IV. TWIN REDUCTION BY FRAUNHOFER
In 1966 another twin reduction method was proposed [18], based on effectively recording in the far field of the object. When replayed, in the far field the image of the object will appear, but the twin image will be so spread out that it appears as a DC term and is therefore effectively removed. A year later the method was reviewed and applied to particle analysis [19], and this was followed by a further reassessment almost a decade later [20]. The method was also applied to in-line electron holography to view undecagold clusters. The method is found to work well with the phase-contrast technique. In [22] Garcia et al. extensively review lensless in-line holographic microscopy and show that the twin image is of no consequence in the reconstructions. They comment that this is because of the diverging reference beam: while the reconstruction of the object image converges upon reconstruction, the out-of-focus twin image continues to diverge and effectively forms the Fraunhofer case, where it is a constant DC term in the 'far field'.

V. TWIN REDUCTION BY SINGLE SIDEBAND ELIMINATION
In 1968 Bryngdahl and Lohmann developed [23] a method to suppress the twin image. The method consists of filtering out half of the spatial frequency spectrum from the transmitted signal during the recording of the hologram, and then filtering out half of the spatial frequency spectrum from the signal during the reconstruction process. The authors suggest that the result of this process will be that "each point scatterer in the object will cause only one half of a zone-plate pattern in the hologram plane." The reconstructed signal will have the twin images on separate sides of its Fourier spectrum, and they can be easily isolated by spatial filtering.

VI. TWIN REMOVAL BY PHASE RETRIEVAL
The 1970s and 1980s saw the development of a new field of research for recovering the phase of a wave field.
IV. TWIN REDUCTION BY FRAUNHOFER

In 1966 another twin reduction method was proposed 18 by effectively recording in the far field of the object. When replayed in the far field the image of the object will appear but the twin image will be so spread out that it appears as a DC term and is therefore effectively removed. A year later the method was reviewed and applied to particle analysis 19 and this was followed by a further reassessment almost a decade later 20. The method was also applied to in-line electron holography to view an undecagold cluster 21. The method is found to work well with the phase contrast technique. In [22] Garcia-Sucerquia et al. extensively review lensless in-line holographic microscopy and show that the twin image is of no consequence in the reconstructions. They comment that this is because of the diverging reference beam. While the reconstruction of the object image converges upon reconstruction, the out of focus twin image continues to diverge and effectively forms the Fraunhofer case where it is a constant DC term in the 'far field.'

V. TWIN REDUCTION BY SINGLE SIDEBAND ELIMINATION

In 1968 Bryngdahl and Lohmann developed 23 a method to suppress the twin image. The method consists of filtering out half of the spatial frequency spectrum from the transmitted signal during the recording of the hologram and then filtering out half of the spatial frequency spectrum from the signal during the reconstruction process. The authors suggest that the result of this process will be that "each point scatterer in the object will cause only one half of a zone-plate pattern in the hologram plane." The reconstructed signal will have the twin images on separate sides of its Fourier spectrum and they can be easily isolated by spatial filtering.

VI. TWIN REMOVAL BY PHASE RETRIEVAL

The 1970s and 1980s saw the development of a new field of research for recovering the phase of a wave field. These methods do not require interference and are collectively known under the name of "phase retrieval algorithms." We can divide these phase retrieval schemes into two subsets: (i) deterministic 24 and (ii) iterative 25-27. Deterministic phase retrieval algorithms are based on analysing the propagation of an intensity diffraction pattern. Iterative methods, on the other hand, rely on recursive 'ping pong' algorithms that converge over time based on some constraints that are imposed in every iteration of the algorithm. They are often highly reliant on some initial condition set at the outset of the iteration process. These ping pong algorithms often require two recorded diffraction patterns but some have been developed that can work with only a single recorded intensity and a suitable constraint. While both deterministic and iterative phase retrieval algorithms have been shown to work with some success, the iterative class has received by far the greatest attention in the literature. In [28] Gerchberg demonstrated how phase retrieval could be used with electron microscopy to recover the phase of the wave field. Despite the initial promise of phase retrieval algorithms, they have never managed to achieve results on a par with holographic methods. However, there has been considerable interest in their usefulness in removing the twin image for in-line holography 29-41.

Twin removal has been successfully accomplished with both deterministic phase retrieval 29 and iterative phase retrieval 30-41. In [29] deterministic phase retrieval is combined with numerical simulations of light propagation to solve the twin image problem. In [30] and [31] Liu and Scott performed the first investigation of using iterative phase retrieval for removing the twin image. The authors note that "retrieval of phase permits separation of real-object distributions from the twin-image interference that accompanies conventional optical reconstruction." However, this algorithm is limited to purely absorbing objects and cannot recover phase shifts caused by transmissive objects. Unfortunately the same can be said for many similar algorithms that followed. Liu continued to improve upon these algorithms by incorporating into them a noise constraint based on a model of additive noise 32. Koren et al. developed 33-34 a new constraint for these iterative twin removal algorithms. They noticed that the out of focus image was considerably larger in area than its in-focus counterpart and they utilized this fact to form their constraint. They showed their algorithm to work for complex objects as well as absorbing objects. These phase retrieval twin removal algorithms were soon successfully applied to in-line x-ray holography 35-36. In [37] another constraint is developed for these ping-pong algorithms, this time to remove the twin image from electron holographic images. In the object plane the phase is replaced with a parabolic phase (similar to the expected shape of the object surface) and in the image plane the intensity is replaced with the measured intensity. The method only works well with pure amplitude objects.
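As a concrete illustration of the ping-pong structure described above, the sketch below alternates between a measured-amplitude constraint in the detector plane and a finite-support constraint in the object plane, in the spirit of the Koren et al. class of algorithms. It is an illustrative sketch only: the angular-spectrum propagator, the support mask and all parameters are assumptions of the example rather than details taken from any of the cited papers.

    import numpy as np

    def propagate(u, dz, wavelength, dx):
        # Angular-spectrum propagation over a distance dz.
        f = np.fft.fftfreq(u.shape[0], d=dx)
        fx, fy = np.meshgrid(f, f)
        kz = 2 * np.pi * np.sqrt(np.maximum(1 / wavelength**2 - fx**2 - fy**2, 0.0))
        return np.fft.ifft2(np.fft.fft2(u) * np.exp(1j * kz * dz))

    def iterative_twin_removal(meas_amp, support, z, wavelength, dx, iters=200):
        # meas_amp: square root of the recorded in-line hologram intensity.
        # support: boolean mask, True where the object is known to lie.
        u = meas_amp.astype(complex)                 # start from measured amplitude, zero phase
        for _ in range(iters):
            obj = propagate(u, -z, wavelength, dx)   # back-propagate to the object plane
            obj = np.where(support, obj, 1.0)        # outside the support: undisturbed unit wave
            u = propagate(obj, z, wavelength, dx)    # forward to the detector plane
            u = meas_amp * np.exp(1j * np.angle(u))  # keep the phase, enforce the measured amplitude
        return propagate(u, -z, wavelength, dx)      # twin-suppressed object estimate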


Advances were made in understanding the sampling requirements of these phase retrieval algorithms in [38]. For real absorbing objects another iterative algorithm has been developed 39 to remove the twin image, this time utilizing both the Gerchberg-Saxton algorithm and the Fraunhofer technique. The iterative phase retrieval algorithm was extended to the case of multiple recordings of different in-line holograms in [40]. Very recently 41 improvements have been made on the iterative technique by using a better model of the object. This improved method works well for phase objects as well as for pure absorbing objects.

VII. TWIN REDUCTION BY LINEAR FILTERING/DIGITAL DECODING

The first digital signal processing technique for the removal of the twin image 42 appeared in 1979 but provided poor results and received little interest. Improved DSP based algorithms were developed some years later by Onural and Scott 43-45. They described linear filtering operations to decode the information contained in the holograms. The filter is a series expansion of the inverse of the operator that maps the object opacity function to the hologram intensity. However, their work did not allow for phase objects. Further advances in linear filtering for twin reduction were later made that did allow, in some cases, for phase objects 46. Maleki and Devaney [47] have proposed a deconvolution algorithm to remove the twin but the method is plagued by the singularity problem, and the calculation is complex. Spence et al. described another non-iterative method 48 but this too is limited to non-phase objects. Yang et al. have developed an algorithm 49 that seems quite similar to previous work on subtraction holography discussed earlier. They devise a DSP method that relies on multiple reconstructions and subtractions. A similar method is developed in [50], seemingly unaware of the work in [49]; however, their algorithm requires two in-line holograms to be captured.

VIII. TWIN REMOVAL BY PHASE SHIFTING HOLOGRAPHY OR SOME FORM OF PHASE MODULATION

In 1997 Yamaguchi and Zhang developed a new method 51 for recording digital holograms free from the twin image, known as 'phase shifting digital holography.' The method allowed for the use of the in-line architecture but required a number of separate interferograms to be captured, in which a phase shift is introduced to one of the interfering wavefields between captures. These phase shifts are usually effected by rotation of a quarter or half wave plate or through the minute vibration of a mirror. A similar method is presented in [52]. Chen et al. have presented a method 53 that allows for the phase-shifting technique to be applied with an arbitrary phase shift and just two captures. Kim et al. have proposed a method 54 for removing the twin image from a white light real time holographic system by utilizing polarization optics and the addition of images. However, their method is based on the triangular interferometer, which has numerous disadvantages. For complex gamma ray holography a phase shifting technique 55 has also been applied based on changing the phase of the nuclear scattering amplitude by detuning from the resonance. Gabor and Goss also implemented an early technique 56 based on phase shifting in which two holograms were captured with a quarter wave phase shift introduced between captures. Reconstruction was optical and required the use of a "quadrature prism" to combine the previous captures and remove the virtual image. In [57] a number of these phase modulation twin removal methods are reviewed. In the case of optical scanning holography, techniques have been proposed 58-60 to remove the twin involving the simultaneous acquisition of sine and cosine Fresnel zone-plate coded images and the adding of the two holograms.

IX. TWIN REDUCTION BY ADDING IMAGES AT DIFFERENT FOCUS

A number of related techniques have appeared in the literature relating to twin removal in in-line electron holography 9,10,62,63. It appears that one of the most successful methods for twin reduction in electron holography is to record multiple holograms of the same object with different wavelengths and then superimpose the intensities of the reconstructions. The out of focus twin image will change for each wavelength and average out. A similar technique has also been developed based on distance instead of wavelength 62,63. Recording a series of holograms of the same object but at different distances implies that the out of focus images will be different in each reconstruction and will integrate to approximate a constant value if a sufficient number of holograms are recorded and reconstructed.

X. TWIN REMOVAL BY SPATIAL FILTERING OF RECONSTRUCTION PLANES

Pedrini et al. propose the first instance 64 of the spatial filtering of reconstruction planes of digital holograms. This involves cutting out the wanted digitally reconstructed image from its surrounding pixels. However, this area still contained considerable noise from the unwanted twin image. In [16] traditional spatial filtering in the Fourier domain was applied to an off-axis digital hologram. Denis et al. have proposed a novel method 65 of spatially filtering the reconstruction domain. It was shown that by cutting out the reconstructed focused unwanted twin and returning to the plane of the virtual wanted image by numerical propagation one could free oneself of the unwanted noise. The method was proposed only in the area of particle holography and the removal of the twin images was a manual operation. In this paper we propose the use of a similar technique for macroscopic objects. We discuss for the first time the relative spreading of the unwanted twin image and the wanted image and how one might manage this spreading in the numerical reconstruction techniques.
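This style of reconstruction-plane filtering can be sketched in a few lines. The fragment below is only an illustration of the idea as we read it: the hologram is propagated to the plane where the unwanted twin comes into focus, the focused twin is masked out (a step that is manual in the cited work, so the mask is assumed to be supplied), and the field is then propagated back to the wanted image plane. The mean subtraction is a crude zero-order suppression assumed for the sketch, and the angular-spectrum propagator is the same assumed helper as in the earlier sketches.

    import numpy as np

    def propagate(u, dz, wavelength, dx):
        # Angular-spectrum propagation over a distance dz (as in the earlier sketches).
        f = np.fft.fftfreq(u.shape[0], d=dx)
        fx, fy = np.meshgrid(f, f)
        kz = 2 * np.pi * np.sqrt(np.maximum(1 / wavelength**2 - fx**2 - fy**2, 0.0))
        return np.fft.ifft2(np.fft.fft2(u) * np.exp(1j * kz * dz))

    def filter_twin_plane(hologram, z_twin, z_image, wavelength, dx, twin_mask):
        # Suppress the twin by masking it in the plane where it focuses.
        u = propagate(hologram - hologram.mean(), z_twin, wavelength, dx)  # focus the twin
        u = np.where(twin_mask, 0.0, u)            # cut out the focused twin (mask assumed given)
        return propagate(u, z_image - z_twin, wavelength, dx)  # return to the wanted plane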


XI. DC REMOVAL

Throughout this discussion we have paid little attention to the zero order term – i.e. the intensity terms that appear as a by-product of the holographic process. In some cases this artifact is far noisier than the unwanted twin. Many of the methods discussed above will remove this term in addition to removing the unwanted twin. A number of methods have been developed in the literature to remove this term alone, to augment those methods that do not. These methods are based on spatial filtering of the hologram 66, subtracting stochastically different holograms 67, phase-shifting 68 and subtracting the numerically generated intensity of the object and reference waves from the digital hologram 69.

XII. CONCLUSION

In this paper we have categorically reviewed the subject of twin reduction and twin elimination in holography. This subject is of paramount importance in interference imaging due to the presence of the twin as a source of noise in the reconstructed image. We have reviewed over sixty years of research on this area. This paper will serve as a valuable reference to those interested in this subject.

REFERENCES
[1] D. Gabor, "A new microscopic principle," Nature (London) 161, 777 (1948).
[2] D. Gabor, "Microscopy by reconstructed wavefronts," Proc. R. Soc. A 197, 454 (1949).
[3] M. Tegze and J. Faigel, "X-ray holography with atomic resolution," Nature (London) 380, 49 (1996).
[4] P. Korecki and J. Korecki, "γ-Ray Holography – Three-Dimensional Imaging of a Local Atomic Structure," Hyperfine Interact. 144, 85 (2002).
[5] J. F. Arocena, T. A. Rotwell, and M. R. A. Shegelski, "Iterative reconstruction of in-line electron holograms," Micron 36, 23 (2005).
[6] J. C. H. Spence, "STEM and Shadow-imaging of Biomolecules at 6 eV Beam Energy," Micron 28, 116 (1997).
[7] H.-W. Fink, H. Schmid, E. Ermantraut, and T. Schulz, "Electron holography of individual DNA molecules," J. Opt. Soc. Am. A 14, 2168-2172 (1997).
[8] T. Matsumoto, T. Tanji and A. Tonomura, "Visualization of DNA in Solution by Fraunhofer In-Line Electron Holography: II. Experiments," Optik 100 (No. 2), 71-74 (1995).
[9] S. Y. Tong, H. Hua Li, and H. Huang, "Energy extension in three-dimensional atomic imaging by electron emission holography," Phys. Rev. Lett. 67, 3102 (1991).
[10] J. J. Barton, "Removing multiple scattering and twin images from holographic images," Phys. Rev. Lett. 67, 3106 (1991).
[11] W. L. Bragg and G. L. Rogers, "Elimination of the Unwanted Image in Diffraction Microscopy," Nature (London) 167, 190 (1951).
[12] G. L. Rogers, "In-line soft-x-ray holography: the unwanted image," Opt. Lett. 19, 67 (1994).
[13] P. Kirkpatrick and H. M. A. El-Sum, "Image formation by reconstructed wave fronts. I. Physical principles and methods of refinement," J. Opt. Soc. Am. 46, 825 (1956).
[14] T. Xiao, H. Xu, Y. Zhang, J. Chen and Z. Xu, "Digital image decoding for in-line X-ray holography using two holograms," Journal of Modern Optics 45:2, 343-353.
[15] E. Leith and J. Upatnieks, J. Opt. Soc. Am. 53, 1377 (1963).
[16] E. Cuche, P. Marquet and C. Depeursinge, "Spatial filtering for zero-order and twin-image elimination in digital off-axis holography," Appl. Opt. 39, 4070 (2000).
[17] G. W. Stroke, D. Brumm, A. Funkhouser, A. Labeyrie and R. C. Restrick, "On the absence of phase-recording or 'twin-image' separation problems in 'Gabor' (in-line) holograms," Brit. J. Appl. Phys. 17, 497 (1966).
[18] J. B. DeVelis, G. B. Parrent Jr., and B. J. Thompson, "Image Reconstruction with Fraunhofer Holograms," J. Opt. Soc. Am. 56, 423 (1966).
[19] B. J. Thompson, J. H. Ward, and W. R. Zinky, "Application of hologram techniques for particle size analysis," Appl. Opt. 6, 519 (1967).
[20] G. A. Tyler and B. T. Thompson, "Fraunhofer holography applied to particle size analysis: a reassessment," Opt. Acta 23, 685-700 (1976).
[21] T. Matsumoto, T. Tanji, A. Tonomura, "Phase-contrast visualization of an undecagold cluster by in-line electron holography," Ultramicroscopy 54, 317 (1994).
[22] J. Garcia-Sucerquia, W. Xu, S. K. Jericho, P. Klages, M. H. Jericho, and H. J. Kreuzer, "Digital in-line holographic microscopy," Appl. Opt. 45, 836-850 (2006).
[23] O. Bryngdahl and A. Lohmann, "Single-Sideband Holography," J. Opt. Soc. Am. 58, 620 (1968).
[24] M. R. Teague, "Deterministic phase retrieval: a Green's function solution," J. Opt. Soc. Am. 73, 1434 (1983).
[25] R. W. Gerchberg and W. O. Saxton, "A practical algorithm for the determination of phase from image and diffraction plane pictures," Optik 35, 227-246 (1972).
[26] J. R. Fienup, "Reconstruction of an object from the modulus of its Fourier transform," Opt. Lett. 3, 27 (1978).
[27] J. R. Fienup, "Phase retrieval algorithms: a comparison," Appl. Opt. 21, 2758 (1982).
[28] R. W. Gerchberg, "Holography without fringes in the electron microscope," Nature 240, 404 (1972).
[29] J. B. Tiller, A. Barty, D. Paganin, K. A. Nugent, "The holographic twin image problem: a deterministic phase solution," Optics Communications 183, 7-14 (2000).
[30] G. Liu and P. D. Scott, "Phase retrieval for in line holograms," in Proceedings of the Nineteenth Annual Conference on Information Sciences and Systems (Johns Hopkins U. Press, Baltimore, Md., 1985), pp. 237-241.
[31] G. Liu and P. D. Scott, "Phase retrieval and twin-image elimination for in-line Fresnel holograms," J. Opt. Soc. Am. A 4, 159 (1987).
[32] G. Liu, "Object reconstruction from noisy holograms," Optical Engineering 29(01), 19-24 (1990).
[33] G. Koren, D. Joyeux, and F. Polack, "Twin-image elimination in in-line holography of finite-support complex objects," Opt. Lett. 16, 1979 (1991).
[34] G. Koren, F. Polack, and D. Joyeux, "Iterative algorithms for twin-image elimination in in-line holography using finite-support constraints," J. Opt. Soc. Am. A 10, 423 (1993).
[35] S. Lindaas, M. Howells, C. Jacobsen, and A. Kalinovsky, "X-ray holographic microscopy by means of photoresist recording and atomic-force microscope readout," J. Opt. Soc. Am. A 13, 1788-1800 (1996).
[36] Kodama, M. Yamaguchi, N. Ohyama, T. Honda, K. Shinohara, A. Ito, T. Matsumura, K. Kinoshita, K. Yada, "Image reconstruction from an in-line X-ray hologram with intensity distribution constraint," Opt. Commun. 125, 36-42 (1996).
[37] L. Bleloch, A. Howie, and E. M. James, "Amplitude recovery in Fresnel projection microscopy," Appl. Surf. Sci. 111, 180 (1997).
[38] J. Miao, D. Sayre, and H. N. Chapman, "Phase retrieval from the magnitude of the Fourier transforms of nonperiodic objects," J. Opt. Soc. Am. A 15, 1662-1669 (1998).
[39] X. M. H. Huang, J. M. Zuo, and J. C. H. Spence, "Wavefront reconstruction for in-line holograms formed by pure amplitude objects," Appl. Surf. Sci. 148, 229 (1999).
[40] Y. Zhang, G. Pedrini, W. Osten, and H. Tiziani, "Whole optical wave field reconstruction from double or multi in-line holograms by phase retrieval algorithm," Opt. Express 11, 3234-3241 (2003).
[41] T. Latychevskaia and H.-W. Fink, "Solution to the Twin Image Problem in Holography," Phys. Rev. Lett. 98, 233901 (2007).
[42] K. H. S. Marie, J. C. Bennett, A. P. Anderson, "Digital processing technique for suppressing the interfering outputs in the image from an inline hologram," Electron. Lett. 15, 241-243 (1979).
[43] L. Onural and P. D. Scott, "A digital filtering system for decoding in-line holograms," in Proceedings of the 1985 IEEE Conference on Acoustics, Speech, and Signal Processing (Institute of Electrical and Electronics Engineers, New York, 1985), pp. 708-711.
[44] L. Onural, "Digital decoding of in-line holograms," Ph.D. dissertation (State University of New York at Buffalo, Amherst, New York, 1985).
[45] L. Onural and P. D. Scott, "Digital decoding of in-line holograms," Opt. Eng. 26, 1124-1132 (1987).
[46] K. A. Nugent, "Twin-image elimination in Gabor holography," Opt. Commun. 78, 293 (1990).
[47] M. H. Maleki and A. J. Devaney, "Noniterative reconstruction of complex-valued objects from two intensity measurements," Optical Engineering 33(10), 3243-3253 (1994).
[48] J. C. H. Spence, X. Zhang and W. Qian, in: Electron Holography, eds. A. Tonomura, L. F. Allard, G. Pozzi, D. C. Joy and Y. A. Ono (Elsevier Science, 1995), pp. 267-276.
[49] S. Yang, X. Xie, Y. Zhao, C. Jia, "Reconstruction of near-field in-line hologram," Optics Communications 159, 29-31 (1999).
[50] A. Zhang and X. Zhang, "Reconstruction of a complex object from two in-line holograms," Opt. Express 11, 572-578 (2003).
[51] I. Yamaguchi and T. Zhang, "Phase-shifting digital holography," Opt. Lett. 22, 1268-1270 (1997).
[52] S. Lai, B. King and M. A. Neifeld, "Wave front reconstruction by means of phase-shifting digital in-line holography," Opt. Commun. 173, 155-160 (2000).
[53] G. L. Chen, C. Y. Lin, H. F. Yau, M. K. Kuo, and C. C. Chang, "Wave-front reconstruction without twin-image blurring by two arbitrary step digital holograms," Opt. Express 15, 11601-11607 (2007).
[54] S.-G. Kim, B. Lee, and E.-S. Kim, "Removal of bias and the conjugate image in incoherent on-axis triangular holography and real-time reconstruction of the complex hologram," Appl. Opt. 36, 4784-4791 (1997).
[55] P. Korecki, G. Materlik, and J. Korecki, "Complex gamma-ray hologram: solution to twin images problem in atomic resolution imaging," Phys. Rev. Lett. 86, 1534-1537 (2001).
[56] D. Gabor and W. P. Goss, "Interference Microscope with Total Wavefront Reconstruction," J. Opt. Soc. Am. 56, 849 (1966).
[57] Y. Takaki, H. Kawai, and H. Ohzu, "Hybrid Holographic Microscopy Free of Conjugate and Zero-Order Images," Appl. Opt. 38, 4990-4996 (1999).
[58] K. Doh, T.-C. Poon, M. H. Wu, K. Shinoda and Y. Suzuki, "Twin-image elimination in optical scanning holography," Optics & Laser Technology 28, 135-141 (1996).
[59] P. Sun and J.-H. Xie, "Method for reduction of background artifacts of images in scanning holography with a Fresnel zone-plate coded aperture," Appl. Opt. 43, 4214-4218 (2004).
[60] T.-C. Poon, T. Kim, G. Indebetouw, B. W. Schilling, M. H. Wu, K. Shinoda, and Y. Suzuki, "Twin-image elimination experiments for three-dimensional images in optical scanning holography," Opt. Lett. 25, 215-217 (2000).
[61] K. Doh, T.-C. Poon, and G. Indebetouw, "Twin-image noise in optical scanning holography," Opt. Eng. 35, 1550-1555 (1996).
[62] U. Weierstall, X. Huang and J. Spence, "Twin image suppression and forward focussing in low voltage in-line electron holography," Proc. MSA, ed. G. Bailey, Jones & Begell, New York, 1997.
[63] J. A. Lin and J. M. Cowley, "Reconstruction from in-line electron holograms by digital processing," Ultramicroscopy 19, 179-189 (1986).
[64] G. Pedrini, P. Fröning, H. Fessler, and H. J. Tiziani, "In-Line Digital Holographic Interferometry," Appl. Opt. 37, 6262-6269 (1998).
[65] L. Denis, C. Fournier, T. Fournel, C. Ducottet, "Twin-image noise reduction by phase retrieval in in-line digital holography," Proceedings of SPIE Vol. 5914, 59140J-1 (2005).
[66] T. Kreis and W. P. O. Juptner, "Suppression of the dc term in digital holography," Opt. Eng. 36, 2357 (1997).
[67] N. Demoli, J. Mestrovic, and I. Sovic, "Subtraction Digital Holography," Appl. Opt. 42, 798-804 (2003).
[68] Y. Zhang, Q. Lü and B. Ge, "Elimination of zero-order diffraction in digital off-axis holography," Opt. Commun. 240, 261-267 (2004).
[69] G.-L. Chen, C.-Y. Lin, M.-K. Kuo, and C.-C. Chang, "Numerical suppression of zero-order image in digital holography," Opt. Express 15, 8851-8856 (2007).


Section 6B
INTELLIGENT SYSTEMS


Generating Initial Codebook of Vector Quantization Based on the Maximum Repulsion Distance

Chen Gang, Zhong Sheng
The International School of Software, Wuhan University, Wuhan, China
aeassist@yahoo.com.cn

Abstract

A definition of repulsion distance is given in this paper. After an investigation of some experiments to compare the performance of two important existing algorithms that generate the initial codebook of vector quantization, a new algorithm is proposed based on the concept of repulsion distance. Some further experiments with the new algorithm are performed to verify its feasibility and to show its performance advantages. Finally, some atypical cases when using these algorithms are discussed.

Keywords: vector quantization, initial codebook, repulsion distance

1. Introduction

Vector quantization is a widely used technique for compressing signals. In vector quantization, every K contiguous signal values are combined into a group, forming a K-dimensional signal vector. All these vectors, which form a training set, are then employed to perform vector quantization. Given a training set of vectors in the K-dimensional space, the codebook generation algorithm can find an appropriate division of all these vectors. Every area in the division is named a cell. The aim of vector quantization is to represent every vector in the training set with the codevector of its corresponding cell. During this process, the deviation between the codevector and the original vector is referred to as distortion. The codebook achieves its best performance when its distortion is minimal. Generally speaking, it is very difficult to generate a globally optimal codebook. Thus, common algorithms usually generate a locally optimal solution.

Generating the codebook is the most important step in vector quantization. The LBG algorithm [1], proposed by Linde, Buzo and Gray in 1980, is the most frequently used codebook generation algorithm. It begins from an existing initial codebook and iterates to generate the final codebook, according to the necessary conditions of optimal vector quantizers, that is, the Nearest Neighbor Condition, the Centroid Condition, and the Zero Boundary Probability Condition. Since the performance of the codebook improves with every iteration, the LBG algorithm is convergent. However, a large quantity of experimental data shows that the rate of convergence and the performance of the final codebook with the LBG algorithm are directly influenced by the initial codebook. So, how to generate a better initial codebook is a problem deserving of study in vector quantization.

Randomly selecting a sufficient number of vectors from the training set as the initial codebook is one of the usual ways to generate an initial codebook. In this way, an initial codebook can be generated in a very short time, for it needs only a small amount of calculation. However, in many cases, the initial codebooks generated in this way are not satisfactory. Besides, there are many other feasible algorithms that generate an initial codebook, such as the deleting algorithm [3], the pairwise nearest neighbor algorithm, the product algorithm [1,4,5], the splitting algorithm [2], the variance classifying algorithm, and so on.
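For reference, the LBG iteration described above can be sketched as follows. This is an illustrative implementation under the usual squared-error distortion, not code from the paper; the training data and initial codebook are assumed to be supplied as numpy arrays.

    import numpy as np

    def lbg(training, codebook, iterations=50):
        # Refine an initial codebook by LBG iterations (squared-error distortion).
        codebook = codebook.copy()
        for _ in range(iterations):
            # Nearest Neighbor Condition: assign each vector to its closest codevector.
            dist = ((training[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            nearest = dist.argmin(axis=1)
            # Centroid Condition: move each codevector to the centroid of its cell.
            for j in range(len(codebook)):
                cell = training[nearest == j]
                if len(cell) > 0:
                    codebook[j] = cell.mean(axis=0)
        return codebook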


2. Common Algorithms

The deleting (DEL) algorithm is widely used to generate the initial codebook in the literature on statistical clustering. This algorithm begins from the entire training set of vectors, and then selectively deletes vectors until the number of vectors remaining meets the requirement. The vectors remaining then form the initial codebook. Suppose d(x, y) measures the distortion between vectors x and y; the DEL algorithm can be described as follows.

(1) Initial state: given a threshold δ, which is a proper constant, the training set of vectors X = {x_i, i = 0, 1, 2, …, N−1} whose size is N, the codebook C = {y_j} whose size is m (at the very beginning, m = 0), and the requirement that the size of the final codebook be M (M ≥ 1);
(2) Set i = 0, C = C ∪ {x_i}, m = m + 1;
(3) Set i = i + 1; if i ≥ N, go to (4); otherwise, go to (5);
(4) Set δ = εδ and i = 0, where ε is a positive decimal which is less than 1, so that the scan restarts with a smaller threshold;
(5) If x_i ∈ C, go to (3);
(6) Find y_j ∈ C so as to make sure that d = d(x_i, y_j) = min over 0 ≤ k < m of d(x_i, y_k);
(7) If d ≤ δ, go to (3);
(8) Set C = C ∪ {x_i}, m = m + 1;
(9) If m < M, go to (3); otherwise, stop.

In addition to the basic deleting algorithm, the pairwise nearest neighbor (PNN) algorithm proposed by Equitz et al. is also a kind of deleting algorithm. This algorithm firstly considers each vector in the training set as a codevector, that is to say, each vector occupies its own cell. Then, the pairwise nearest two cells are merged into a new cell continually, until the number of cells meets the requirement. Likewise, suppose d(x, y) measures the distortion between vectors x and y; the PNN algorithm can be described as follows.

(1) Set the number of codevectors n = N; all the cells are R_i, i = 0, 1, 2, …, N−1, and the corresponding centroids are y_i = x_i, i = 0, 1, 2, …, N−1;
(2) Calculate the distortion of each pair of codevectors y_l and y_m, i.e., d(y_l, y_m), 0 ≤ l < m ≤ n−1;
(3) If d(y_i, y_j) = min over 0 ≤ l < m ≤ n−1 of d(y_l, y_m), merge cells R_i and R_j; meanwhile, update codevector y_i to y_i = (|R_i| y_i + |R_j| y_j) / (|R_i| + |R_j|), where |R_i| is the number of vectors R_i contains. Then, if j = n−1, remove the codevector y_j; otherwise, set y_j = y_{n−1}, R_j = R_{n−1}, and remove the codevector y_{n−1}; set n = n − 1;
(4) If n ≤ M, stop; otherwise, go to (2).

To compare the performance of the final codebooks generated by the two algorithms (DEL and PNN), we carried out an experiment on the two-dimensional surface. First of all, we generated several points on the two-dimensional surface to constitute the training set of vectors. Then, we used these algorithms respectively to generate two initial codebooks, which were then passed to the LBG iterations. At last, we could judge which algorithm generated the better initial codebook from the final distortion after the LBG iterations.

We did two groups of experiments. In each group, we generated randomly 100 two-dimensional vectors (points) to constitute the training set. The difference is that in the first group, all the points in the training set were distributed within a circular region with center at the point (0, 0) and radius equal to 0.5, while in the second group, all the points in the training set were distributed within a rectangular region enclosed by four lines: x = −0.5, x = 0.5, y = −0.5, y = 0.5. In each group, the final codebook contained only 5 codevectors, and the experiment was repeated 10,000 times. We recorded how many times each of the two algorithms generated the better initial codebook during the process.

Figure 1. Training points within a circular region


Figure 2. Training points within a rectangular region

The distributions of the training sets of points in the two groups of experiments, drawn with Matlab, are shown in Figure 1 and Figure 2.

The results of the experiments are shown in Table 1 and Table 2. Note that, since sometimes the codebooks generated by the two algorithms may result in the same performance, both may be considered as "better" in one experiment. So, the sum of their times better is greater than 10,000.

Table 1. Results of the 1st group
Algorithm   Times better   Conclusion
DEL         5853           DEL is better.
PNN         4994

Table 2. Results of the 2nd group
Algorithm   Times better   Conclusion
DEL         5215           PNN is better.
PNN         6531

We can draw a conclusion from the results above that the DEL algorithm and the PNN algorithm perform differently in different circumstances. When the training points are in a circular region, the DEL algorithm performs better; but when the training points are in a rectangular region, the PNN algorithm performs better.

3. The Algorithm Based on the Maximum Repulsion Distance

From the experiments, we know that the performance of the final codebooks developed from the initial ones respectively generated by the DEL algorithm and the PNN algorithm is similar. To try to get an even better performance, we propose a new algorithm for generating the initial codebook, based on the maximum repulsion distance: the MRD algorithm. This algorithm is built on the premise that the initial codevectors should be as dispersed as possible. That is to say, the initial codebook will be good enough only if all the codevectors are selected from different cells of a good division of the training set of vectors. Thus, to ensure that the codevectors are as dispersed in the training set as possible, and that they are selected from different cells, the MRD algorithm always selects the vector farthest from the already selected codevectors as the next one added into the initial codebook in every iteration.

We introduce the concept of the repulsion distance to measure how far a vector is from all the vectors in a vector set. Suppose there is a vector v and a vector set X = {x_i, i = 1, 2, …, N}; the repulsion distance between v and X is defined as rd(v, X) = min over 1 ≤ i ≤ N of d(v, x_i), where d(x, y) is the distortion between two vectors. The algorithm for generating initial codebooks based on the maximum repulsion distance can then be described as follows.

(1) Initial state: given the training set of vectors X = {x_i, i = 0, 1, 2, …, N−1} whose size is N, the codebook C = {y_j} whose size is m (at the very beginning, m = 0), and the requirement that the size of the final codebook be M (M ≥ 1);
(2) Calculate the centroid c of X;
(3) Find v ∈ X so as to make sure that d(v, c) = max over 0 ≤ i < N of d(x_i, c). Set C = C ∪ {v}, m = m + 1;
(4) Find v ∈ X so as to make sure that rd(v, C) = max over 0 ≤ i < N of rd(x_i, C). Set C = C ∪ {v}, m = m + 1;
(5) If m < M, go to (4); otherwise, stop.

As with other algorithms for generating initial codebooks, the MRD algorithm cannot obtain a globally optimal solution, only a locally optimal one.
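A compact implementation of the MRD selection, assuming the squared-error distortion, might look as follows. This is our own illustrative rendering of steps (1)-(5) above, not code from the paper.

    import numpy as np

    def mrd_initial_codebook(training, M):
        # Select an initial codebook of size M by maximum repulsion distance.
        centroid = training.mean(axis=0)
        # Step (3): the first codevector is the vector farthest from the centroid.
        first = ((training - centroid) ** 2).sum(axis=1).argmax()
        codebook = [training[first]]
        # Steps (4)-(5): repeatedly add the vector with the maximum repulsion distance.
        while len(codebook) < M:
            d = ((training[:, None, :] - np.array(codebook)[None, :, :]) ** 2).sum(axis=2)
            repulsion = d.min(axis=1)   # rd(x_i, C): distance to the nearest codevector
            codebook.append(training[repulsion.argmax()])
        return np.array(codebook)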


The feasibility of this algorithm can be explained as follows. After step (3) is executed, there is only one vector in the codebook. This vector must belong to a certain cell from a division of the training set of vectors. In each iteration, every vector that is selected as the next codevector is the farthest one from the codevectors which have already been selected. Suppose the candidate vector is x, and the nearest codevector in the current small-size codebook is y; then the repulsion distance rd = d(x, y). If x and y don't belong to the same cell, the algorithm gains its end. Otherwise, suppose x and y do belong to the same cell; then the diameter of the cell should be at least rd = d(x, y). Because rd is the maximal repulsion distance, the distance from any other vector in the training set to the existing codebook should be less than rd. Then, the "largest" cell is the one containing x and y, and it should be split to create more codevectors in the initial codebook. Thus, x is also a best one to be selected in this iteration.

To test the performance of the algorithm, we repeated the two groups of experiments above, with the new algorithm joined in. The initial states, conditions and requirements remained the same. The results are shown in Table 3 and Table 4.

Table 3. Results of the 1st group
Algorithm   Times best   Conclusion
MRD         4764         MRD is the best.
DEL         4116
PNN         3430

Table 4. Results of the 2nd group
Algorithm   Times best   Conclusion
MRD         4674         MRD is better than DEL.
DEL         3907
PNN         5011

Note that, since sometimes two or three algorithms may perform just the same, two or three of them can be best in one experiment.

We can see from the results above that, under different circumstances, the MRD algorithm is better than the DEL algorithm, and sometimes better than the PNN algorithm. Even in the cases where PNN performs best, the gap between MRD and PNN is still small.

4. Discussion

The basic idea of the MRD algorithm is, when selecting initial codevectors, to select the vector farthest from the existing codevectors so that all the initial codevectors lie in different cells according to the final division. That decreases both the total iterations needed by LBG and the distortion of the final codebook. Thus a good codebook is generated.

In the experiments above, the MRD algorithm performs satisfyingly. In many cases, the performance of the codebook generated by the MRD algorithm is pretty good.

However, the MRD algorithm is not able to recognize noise in the training set of vectors. So, in some atypical cases, when the training vectors are distributed in a special way, the MRD algorithm will tend to select the vectors which are farthest without considering any other conditions. This may lead to a bad initial codebook.

For instance, in one atypical case, the training set of vectors consists of 112 randomly generated points on a two-dimensional surface. Among them, 100 points are distributed in a circular region with center at the point (0, 0) and radius equal to 0.5. However, around this region, 4 groups of noise points are located in small circular regions at (1, 0), (0, 1), (−1, 0) and (0, −1). Every group has 3 points.

Figure 3. Training set with noise

Like the experiments above, we repeated this experiment 10,000 times. The distribution of the training set of points as well as the noise points in this experiment, drawn with Matlab, is shown in Figure 3, and the result is shown in Table 5.

Table 5. Results of the special case
Algorithm   Times best   Conclusion
MRD         3231
DEL         9278         DEL is the best.
PNN         3214

In such a case, because of the existence of the 12 noise points, the performance of the MRD algorithm suddenly drops, worse than the DEL algorithm.

However, in the real world, the probability of


this case is quite small. So, from a statistical point of view, the MRD algorithm achieves high performance on average. Finally, how to filter the noise and how to improve the performance of the MRD algorithm under a variety of circumstances are problems that deserve further study.

5. References

[1] LINDE Y, BUZO A, GRAY R M. An algorithm for vector quantizer design. IEEE Transactions on Communications, 1980, 28(1): 84-95.
[2] Sheng Yuanjun, Liu Kejun. An Algorithm of Setting VQ Initial Codebook Based on the Principle of Cell Distortion Balance. Journal of Harbin Engineering University, Vol. 21, No. 5, Oct. 2000.
[3] Sun Shenhe, Lu Zheming. Technique and Application of Vector Quantization. Beijing: Science Press, 2002.
[4] MAKHOUL J, ROUCOS S. Vector quantization in speech coding. Proceedings of the IEEE, 1985, 73(11): 1551-1585.
[5] GRAY R M. Vector quantization. IEEE ASSP Magazine, April 1984: 4-27.
[6] Lee K F. Automatic speech recognition - the development of the SPHINX system. Boston: Kluwer Academic Publishers, 1989.
[7] Abut H, et al. Vector Quantizer of Speech and Speech-Like Waveforms. IEEE ASSP-30, 1982: 423-435.
[8] S. P. Lloyd. Least Squares Quantization in PCM. IEEE Trans. on Information Theory, Vol. 28, No. 2, 1982: 129-137.


Investigating the Influence of Population and Generation Size on GeneRepair Templates

Amy FitzGerald
Department of Computer Science
National University of Ireland, Maynooth
Kildare, Ireland
amy.fizgerald@nuim.ie

Diarmuid P. O'Donoghue
Department of Computer Science
National University of Ireland, Maynooth
Kildare, Ireland
diarmuid.odonoghue@nuim.ie

Abstract— In 2005 Lolle et al published controversial findings showing that the Arabidopsis thaliana plant repairs invalid genetic information using the grandparent as a kind of repair template. We have previously shown how a genetic repair operator (GeneRepair) can be used to correct invalid individuals in an evolutionary strategy. It has been shown that superior results are produced when the individual's grandparent is used as the repair template in comparison to using the individual's parent. This paper investigates whether the results produced by GeneRepair templates are affected by the parameters of population size and number of generations. The results indicate that the grandparent template outperforms the parent template regardless of population or generation size. These findings further support the controversial theory of Lolle et al.

Keywords— GeneRepair; evolutionary strategy; non-Mendelian inheritance; population size; generation size; Arabidopsis thaliana

I. INTRODUCTION

Evolutionary Strategies (ES) are based on the Darwinian principle of survival of the fittest. Their expected success is based on the fact that they mirror biological evolution. ES have been shown to be successful when used on problems with large search spaces, but traditional ES are ill-suited to constraint based problems [1]. Three approaches have been adopted to enforcing constraints upon an ES: penalty points [2], modified ES operators and genetic repair. In this paper we evaluate a modified ES that incorporates a genetic repair process.

In 2005 Lolle et al published a controversial paper showing that the Arabidopsis thaliana plant used a genetic repair process to correct invalid genetic information [3]. The corrected individuals appeared to be repaired using information originating in the grandparent generation – information that apparently by-passed the parent generation. This radical form of non-Mendelian inheritance was treated with much skepticism by some of the scientific community.

In this paper we modify a traditional ES to mirror this genetic repair technique. We compare the effect of population and generation size on the results produced, using both the parent and grandparent template to repair invalid individuals. Invalid individuals are those that do not satisfy the constraints of a given problem – the biological equivalent of producing a viable individual/phenotype. For this paper we have used the TSPLIB eil51 51-city Travelling Salesman's Problem (TSP) to evaluate our hypothesis and compare results. The results show that as the evolutionary strategy is modified to more closely mirror biology, the grandparent becomes a superior candidate for use as a repair template than the parent.

II. GENEREPAIR

GeneRepair [4] is an operator which repairs invalid individuals produced by crossover or mutation in an evolutionary strategy. An example of an invalid individual for the TSP would be an individual with duplicate cities. The GeneRepair operator would ensure that this individual satisfies the problem constraints by replacing the duplicate city with any missing city.

Invalid Individual:    7 2 3 4 5 1 1
Repair Template:       . . . . . . 1 6
Repaired Individual:   7 2 3 4 5 1 6

Figure 1. An individual with an invalid duplicate gene
As we can see in Fig. 1, the individual is invalid as it has a duplicate of city 1. GeneRepair is invoked to repair this error by replacing the duplicate with a missing city. In order to decide which city to use as the replacement, GeneRepair uses a template. Our research compares the use of the individual's parent and the grandparent as possible repair templates.
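A minimal sketch of such a repair operator for the TSP is given below. It is our illustrative reading of the operator, not the authors' implementation; in particular, the choice to take the missing cities in the order they appear in the template is an assumption of this example.

    def gene_repair(individual, template):
        # Repair a tour: replace repeated cities with the missing ones, guided by the template.
        seen = set()
        duplicate_positions = []
        for i, city in enumerate(individual):
            if city in seen:
                duplicate_positions.append(i)   # later occurrences are treated as the errors
            seen.add(city)
        # Cities absent from the individual, taken in the order they occur in the template.
        missing = [c for c in template if c not in seen]
        repaired = list(individual)
        for pos, city in zip(duplicate_positions, missing):
            repaired[pos] = city
        return repaired

    # The example of Fig. 1: city 1 is duplicated and city 6 is missing.
    print(gene_repair([7, 2, 3, 4, 5, 1, 1], template=[1, 2, 3, 4, 5, 6, 7]))
    # -> [7, 2, 3, 4, 5, 1, 6]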


III. RESULTS

We ran an experiment to compare grandparent and parent GeneRepair with a population of ten for five hundred thousand generations on the eil51 TSP, to investigate the consequences of change in generation and population size on the success of the GeneRepair templates. We ran this experiment 260 times. Our results are presented in two separate subsections. In the first subsection we present the results produced by the 260 runs. We sample these results at 100,000, 250,000 and 500,000 generations and at each point we compare the use of parent and grandparent templates. In the second subsection we analyze the best result produced using the parent template and compare it to the best result produced using the grandparent template. We sample both of these files at 50, 500, 5,000, 50,000 and 500,000 generations and compare the best parent solution and the best grandparent solution at each point.

A. Comparison of Final Results Produced by Parent and Grandparent Templates

In previous research [5] we have shown how the grandparent can be used as a template for repair in an ES to successfully find near optimal solutions to constraint based problems. In this paper we investigate whether a change in population size or number of generations has an effect on those findings. We have used a population of ten for this set of experiments, which is a tenth of the population used in our previous publication. The mutation rate was set to two for all experiments. This mutation rate is based on the findings from the investigation of mutation rates carried out by Mitchell [6].

In this first set of results we compare the final results produced by each of the 260 experiments carried out using the parent template to repair invalid individuals to the 260 experiments carried out where the grandparent template was used. Each of the experiments was run for 500,000 generations. In Fig. 2 below you can see that each of the grandparent experiments produced a smaller tour length, and so a better result, than its parent counterpart.

Figure 2. Results for 260 experiments after 500,000 Generations

We went on to sample each of these results at 100,000 and 250,000 generations. In Fig. 3 the result of each experiment is shown after 100,000 generations. It can be clearly seen that there is a significant gap between the tour length produced using the grandparent template and the tour length produced using the parent template for the majority of the results.

Figure 3. Results for 260 experiments after 100,000 Generations

In Fig. 4 below we can see that the gap between the results produced when using the parent and grandparent template has grown when the results were sampled at 250,000 generations. The grandparent produces superior results to its parent template for every experiment. You can also see that the best results produced by the grandparent, as highlighted in Fig. 4, are significantly better than those produced by the parent template.

Figure 4. Results for 260 experiments after 250,000 Generations

If we analyze the results produced by the experiment as illustrated in Fig. 2, it is clear that grandparent is superior to parent as a candidate for a GeneRepair template even though the population size was one tenth of the size in previously published results. By not only looking at the final result but also sampling the experimental results at two separate points we can see that the grandparent template produces superior results to the parent template regardless of generation size. If we go on to analyze these results further we can see that while the average grandparent has a lower tour length than the average parent, the standard deviation across the results produced by the grandparent template is much wider (see Table 1).
Perhaps it is this diversity within the grandparent that allows us to produce better results through wider exploration of the search space.


TABLE I. EXPERIMENTAL RESULTS FOR 260 RUNS

Template      Average Tour Length   Standard Deviation
Parent        799.2769              16.06984
Grandparent   796.2654              19.63673

B. Analysis of Best Parent and Best Grandparent Results

This second section of our results compares and analyzes the best result produced when using the parent template against the best result produced when using the grandparent template, as highlighted in Fig. 2. We have sampled these results at five different points in the experiment to compare parent and grandparent GeneRepair at five different generation sizes. The results in Fig. 5 compare the best results produced by both grandparent and parent GeneRepair after 50 generations. The lines in the graph indicate the tour length (Y-axis) produced for the eil51 TSP at that particular generation (X-axis). We can see that after only 50 generations a small but significant gap has formed between the use of grandparent and parent templates. We can also identify that using the parent template has caused the decrease in tour length to somewhat plateau in comparison to the steady improvement of results when using the grandparent template.

Figure 5. Results after 50 Generations

In Fig. 6 below we can see that the gap between the results produced by the grandparent and parent template has grown significantly after 500 generations. Grandparent appears to continue to evolve while once again the parent template has caused a plateau in the results.

Figure 6. Results after 500 Generations

We have sampled the experiment at 5 different points: 50, 500, 5,000, 50,000 and 500,000 generations. The only time that parent produces better results is at 5,000 generations, but even at this point the difference between grandparent and parent is minimal in comparison to the difference between them at each of the other sampled points (see Table 2 & Fig. 7).

Figure 7. Results after 5,000 Generations

Figure 8. Results after 50,000 Generations


When the results are next sampled at 50,000 generations grandparent is significantly better than parent again, which cements the finding that the result at 5,000 generations was a definite outlier for the parent template in comparison with the consistently positive results for the grandparent template (Fig. 8).

The last sample of the results was taken at 500,000 generations, and the gap between the parent and grandparent template has once again grown. The final results shown in Fig. 9 illustrate that on completion of the experiment the grandparent template has produced a significantly lower tour length than the parent template.

Figure 9. Results after 500,000 Generations

Table 2 provides a summary of the results as sampled at the five points explained above. We can see that the difference between the grandparent and parent template becomes more apparent as you increase the number of generations in the experiment. Not only is grandparent a superior repair template, it is impervious to both population size and the number of generations. We can see that it is resistant to changes in the population as this experiment gives the same conclusion as previously published results [5] even though the population used for the results shown in this paper is a tenth of what it was for previous experiments. We have also shown that it is resistant to changes in the number of generations by sampling the results at five different points and comparing the parent template to the grandparent template at each of these points (see Table 2).

TABLE II. FINAL RESULTS

                         Repair Template
Number of Generations    Parent    Grandparent
50 Generations           1098      1092
500 Generations          1028      960
5000 Generations         839       842
50000 Generations        837       816
500000 Generations       752       685

IV. CONCLUSION

Previous results [5] have illustrated that the grandparent is a superior repair template to the parent when used by GeneRepair in an ES. This paper goes on to show that the superiority of the grandparent repair template is invariant across many population and generation sizes. The population used for the experiments illustrated in this paper was set to 10, in comparison with a population of 100 in previously published results [5]. The final 260 experiments were analyzed at three different generation sizes. The results showed that at each of the sampled points the grandparent template outperformed the parent as a repair template. The results also showed that the grandparent template continuously produced superior results, and so at each sampled point in the experiment it was a new best result produced by the grandparent being compared to the parent. The grandparent continually produced superior results as opposed to one superior outlier.

We then went on to examine the best result produced by the parent template in comparison to the best result produced by the grandparent template. The experiment was sampled at five different generation sizes and four out of five of these samples were positive that the grandparent is a superior template. The one sample that was not positive was too weak to suggest that the parent template was superior. It is also to be expected with a stochastic method such as ES that a small proportion of the results will be unreliable due to the strong influence of outliers on the overall results.

The results presented in this paper strongly support the controversial findings of Lolle et al [3], where they show that in the biological environment of the Arabidopsis thaliana plant the grandparent template is successfully used as the repair template to correct invalid genetic information.
This paper concludes that not only is it possible to use the grandparent as a template for repair, it is shown to be superior to using that of the parent regardless of population and generation size.

ACKNOWLEDGMENTS

The research of Amy FitzGerald, carried out as part of her PhD, is funded by IRCSET. The authors acknowledge the CS Dept of NUIM for access to the CREIG cluster to run the above.

REFERENCES
[1] Coello Coello, C., "Theoretical and Numerical Constraint Handling Techniques in Evolutionary Algorithms: A Survey", Comp. Methods in App. Mathematics and Engineering, Vol 191(11,12), pp. 1245-1287 (2002)
[2] Kalyanmoy, D., "An Efficient Constraint Handling Method for Genetic Algorithms", Comp. Methods in Applied Mechanics & Engineering, Vol 186(2-4), pp. 311-338 (2000)
[3] Lolle, S.J., Victor, J.L., Young, J.M., Pruitt, R.E., "Genome-wide non-mendelian inheritance of extra-genomic information in Arabidopsis", Nature, vol. 434, pp. 505-509, 2005
[4] Mitchell, G.G., O'Donoghue, D.P., Trenaman, A., "A New Operator for Efficient Evolutionary Solutions to the TSP", Applied Informatics vol. 0-88986-280-X, pp. 771-774, 2000.
[5] FitzGerald, A., O'Donoghue, D.P., "Genetic Repair for Optimization under Constraints", 10th Intl Conference on Parallel Problem Solving From Nature (PPSN 2008), Dortmund, Germany, pp. 399-408, 2008.
[6] Mitchell, G.G., "Evolutionary computation applied to Combinatorial Optimization Problems", PhD Thesis, Dublin City University, Dublin, Ireland, 2007


Intelligent Learning Systems: Where are They Now?

George G. Mitchell, Colm P. Howlin
Research and Development Laboratory, CCKF Ltd.,
Tallaght, Dublin 24, Ireland.
{george.mitchell, colm.howlin}@cckf-it.com

Abstract— This paper documents the work in progress in creating a new Intelligent Learning System. The paper introduces the issues of relevance when developing such systems. It examines briefly the state of the art, followed by key approaches to addressing the major research questions in the field. We introduce the Realise it system, outlining the philosophy behind the system and also the novel approach which we are taking to its implementation. Following this we explore the as yet unanswered research questions which must be addressed in order to develop what we believe is a truly intelligent learning system; one which can appropriately and intelligently guide learners (students) along their individual learning paths throughout their lives.

Keywords— Intelligent Learning, ITS, Artificial Intelligence, Data Mining, Knowledge Management.

I. INTRODUCTION

Over the last twenty years we have seen an explosion in the number of computer based training systems (also: VLE, LMS, LCMS, ITS and e-learning). With the internet we have a plethora of web based learning systems, all of which can utilise a variety of ways to present the material to students: text, graphics, audio and movies. The majority of educational institutions at third level have now implemented some form of managed learning environment comprising a backbone Virtual Learning Environment [1] of Moodle, Blackboard, Claroline, etc. These systems essentially provide knowledge management of the educational content, assessment and also a level of administrative assistance. To increase the attractiveness of these systems to learners, over the last few years these systems have embraced new educational technologies. The use of this technology has long been questioned, particularly around issues of relevance such as: to what extent do these new technologies assist the student in gaining the required knowledge?

A number of education technology researchers have investigated this issue; for example, Clark and Mayer [2] have investigated an interaction enhanced e-Learning system (using natural language, first and second person pronouns for assessment and feedback). Their approach to using novel technology was benefit driven: not just because it was available, but because following scientific investigation it was found to be superior to other (evaluated) approaches. However, adding new technology to a learning system that does not contain any real intelligence raises many questions.

Many of the aforementioned VLEs utilise homogeneous learning paths for all learners. These systems achieve an economical educational approach both in terms of finance and educational benefit to all students. More flexible approaches which adapt to individual learners' needs have been attempted. One of the primary research areas is Intelligent Learning Systems [3, 4] (sometimes referred to as Intelligent Tutoring Systems). This is an area of AI research which fundamentally highlights where cognitive interaction of man and machine is critical for the best performance of both systems.

Some researchers have suggested that further work is required to improve issues around homogenised e-Learning systems together with poor support and feedback for the individual learners.

II. INTELLIGENT TUTORING SYSTEM

An Intelligent Tutoring System (ITS) as defined by the AAAI [5] is "educational software containing an artificial intelligence component.
The software tracks students' work, tailoring feedback and hints along the way. By collecting information on a particular student's performance, the software can make inferences about strengths and weaknesses, and can suggest additional work." The AI element is the key component of individualised learning systems. Over the last ten years a number of different systems have been developed which provide intelligent tutoring. These systems, however, have developed along very narrow lines in terms of subject area and educational objectives. An example is algebra for first year university Mathematics students. Tom Murray, a significant researcher and author in the ITS and adaptive learning systems field, was critical that the ITS field lacked vision [6]. In 1999 he summarised the focus of research in ITS as applying to: particular subject domains (like algebra or calculus), learning environments which embodied either an instructional metaphor or a traditional curriculum metaphor only, and those which were solely pedagogy or performance oriented. His 1999 summary of the state of the art sadly remains largely unchanged today. Murray identified a number of areas which he felt were important for future research in order to move the field forward. He noted that these were neglected in the work up to 1999, and today they remain largely uninvestigated.


One central element of research which has been identified as requiring significant exploration is the human computer interaction in learning systems. How information is presented to the learner is a complex problem, requiring direct input from a number of interdisciplinary specialists such as: HCI, Educationalists (both pedagogical and androgogical), Cognitive Science and Software Engineering.

III. MAN & MACHINE

Understanding the needs of both man & machine is critical when creating an effective intelligent and adaptive learning system. It is relatively straightforward to assess the state of knowledge in an artificial system. However, it is somewhat more complex to assess the current state of knowledge in a human learner. From a human psychology perspective, through profiling and observation of the individual, an estimation of the knowledge profile of the user is possible. The degree of belief one places in this is influenced by a number of correlated factors including: the level of profiling, and the techniques used in this profiling [7]. Effective user profiling is closely associated with the development of a highly coupled human computer interaction [8]. Many studies have identified differing profiling techniques (e.g. learning style profiles [9, 10, 11]), all of which can profile particular elements of the user, using questions ranging from 10 in number to 1000 or more. New approaches that have been used to successfully assess the interaction of users and systems include those which combine traditional profiling with theories of Cognitive Load [12] on a student as he/she interacts with the system. Measuring passive interactions with the systems through various techniques such as head and eye tracking, or EEG, can assist in improving our understanding of the HCI process and what most suitably works with individual learners [13].

IV. REALISE IT INTELLIGENT LEARNING SYSTEMS

The Realise it Intelligent Learning System is currently a work in progress (see Figure 1, which depicts a typical view in the system). A number of key modules have been completed, including the Artificial Intelligence Engine (AIE); see Figure 2. The AIE is tasked with providing the differing system modules with the specific intelligence to dynamically modify and develop each student's learning path.

The approach builds upon the current state of knowledge concerning expert systems. This is achieved through the use of evidence based decision (EBD) making. Our EBD approach utilises proactive and passive assessments coupled with profiling to create an evidence profile which can feed into an innovative blend of allied algorithms such as: Bayesian Reasoning [14], Simulated Annealing [15], Information Theory [16], Graph Theory [17], Fuzzy Logic [18], Classification Trees [19], and Nonlinear and Ordinal regression techniques [20].

The Realise it system attempts to embrace the issues raised by Murray. The learning system is enhanced with a blend of methodological approaches including instructional metaphor, traditional curriculum metaphors, pedagogy-oriented and performance oriented metaphors, all united in a single environment. This approach facilitates the larger objective of enabling the system to be applicable across a variety of differing subject areas such as Math, Science, Languages, Business, History and Social Sciences.

Figure 1. Screenshot of the Realise it Intelligent Learning System
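As a toy illustration of evidence based decision making of the kind described above, the fragment below applies a simple Bayesian update to the probability that a learner has mastered an item, given a stream of assessment outcomes. This is purely our own sketch of the general idea: the slip and guess parameters are hypothetical, and it does not represent the actual Realise it engine.

    def update_mastery(p_known, correct, p_slip=0.10, p_guess=0.25):
        # One Bayesian update of the probability that a learner has mastered an item.
        if correct:
            lik_known, lik_unknown = 1.0 - p_slip, p_guess
        else:
            lik_known, lik_unknown = p_slip, 1.0 - p_guess
        evidence = lik_known * p_known + lik_unknown * (1.0 - p_known)
        return lik_known * p_known / evidence

    # A stream of assessment evidence gradually sharpens the mastery estimate.
    p = 0.5
    for outcome in [True, True, False, True]:
        p = update_mastery(p, outcome)
    print(round(p, 3))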


The primary goals of the Realise it project that have been achieved so far provide the following core design features:

• Multi disciplinary learning approach. The system provides a single environment for the student to learn a broad range of subjects such as Languages, Math, Science and History. A learning objective for one individual may include learning items drawn from different subjects. This novel approach is a result of trials which have shown that individual student learning paths typically require skills drawn from a broad range of disciplines.

• Individual dynamic learning paths. Each individual learner has a dynamic learning path for their particular learning objective. The learning path is integral to the navigation through the educational material and is consulted (and possibly modified) continuously during a learning session. This technique is one of our approaches to ensuring that the user can always determine "where am I?" in relation to their progress towards intended learning objectives.

• Focus on individual learning and appropriate learning designs. The system is designed to reduce the technology noise that is often present in eLearning applications. This is particularly acute when significant amounts of valuable "learning time" are devoted to solving technology problems instead of addressing learning problems.

• Flexible and adaptable learning environments. The system has been designed to embrace Cognitive Load Theory [21] and its impact upon Learning Design, creating a modern flexible learning system. Using the Learning Design approaches and standards of IMS Global (IMS-LD) [22] as a foundation, our system is self-intelligent. This provides intelligent flexibility to an individual learner's desired pedagogical (or andragogical) approach.

Trials of the system are currently underway and it is expected that further publications will document the results of these trials when they have been completed (expected later in 2009).

V. FUTURE WORK

At present we have identified two key areas of further research which we expect will significantly enhance the understanding of intelligent learning systems. Initially we will focus on how intelligent learning systems can become even more intelligent and how they can best be utilised in an efficient man and machine manner. To achieve this, a fuller understanding of the profile of both the learner and the content is required. We will focus on:

• Behavioural assessment & modelling
• Cognitive assessment & modelling
• Learning style assessment & modelling

This will enhance our approach to developing a suite of tools to accurately instruct and assess the individual learner across a range of differing specialist fields, subject areas and multi disciplinary learning path objectives. This will ensure that the system continuously adapts (a) assessment methods, (b) content delivery vehicles and (c) the learning paths for a wider range of subjects and individual learners.

A second key area of work is how to fully embrace interoperability and usage of differing educational content. Content presentation and navigation is tightly coupled with the advantages of technology driven education and also with cognitive loading. Improper usage of technology may merely divert attention away from the learning objective.
Equally, it is likely that cognitive profiling may indicate where novel usage of new technology may be more effective, for example with special-case learners such as those who have an intellectual disorder or an attention deficit disorder.

It is the marrying of cognitive psychological techniques and artificial intelligence techniques that, we suggest, will bring about the next generation of intelligent learning systems.

VI. SUMMARY

In this paper we have briefly explored the work in progress in creating the Realise it Intelligent Learning System. The paper introduced the issues of relevance in developing such a system. It examined briefly the state of the art, followed by key approaches to addressing the major research questions in the field. We introduced the Realise it system, outlining the philosophy behind the system and also the novel approach which we are taking to its implementation. Following this we identified the outstanding research questions which must be addressed to ensure a truly intelligent learning system which can appropriately guide learners along their individual learning paths.


REFERENCES

[1] Anderson, T. (Ed.) (2009), Theory and Practice of Online Learning, 2nd Edition, Athabasca University Press, Edmonton, Canada.
[2] Clark, R.C. and Mayer, R.E. (2003), E-Learning and the Science of Instruction: Proven Guidelines for Consumers and Designers of Multimedia Learning, Jossey-Bass/Pfeiffer, San Francisco, CA, USA.
[3] Weber, G. and Specht, M. (1997), User modeling and adaptive navigation support in WWW-based tutoring systems, Proc. 6th International Conference on User Modeling, pp. 289–300.
[4] Brusilovsky, P., Karagiannidis, C. and Sampson, D. (2004), Layered Evaluation of Adaptive Learning Systems, International Journal of Continuing Engineering Education and Lifelong Learning 14 (4/5), pp. 402–421.
[5] http://www.aaai.org/AITopics/pmwiki/pmwiki.php/AITopics/IntelligentTutoringSystems
[6] Murray, T. (1999), Authoring Intelligent Tutoring Systems: An Analysis of the State of the Art, International Journal of Artificial Intelligence in Education 10, pp. 98–129.
[7] Gasparetti, F. and Micarelli, A. (2005), User Profile Generation Based on a Memory Retrieval Theory, Proceedings of the 1st International Workshop on Web Personalization, Recommender Systems and Intelligent User Interfaces (WPRSIUI 2005), pp. 3–7.
[8] Gray, W. D., Sims, C. R. and Schoelles, M. (2005), Cognitive Metrics Profiling, 49th Annual Conference of the Human Factors and Ergonomics Society, Santa Monica, CA: Human Factors and Ergonomics Society.
[9] Zhang, L. (2003), Does the big five predict learning approaches?, Personality and Individual Differences 34, pp. 1431–1445.
[10] Bidjerano, T. and Yun Dai, D. (2007), The relationship between the big-five model of personality and self-regulated learning strategies, Learning and Individual Differences 17(1), pp. 69–81.
[11] Chamorro-Premuzic, T. and Furnham, A. (2008), Personality, intelligence and approaches to learning as predictors of academic performance, Personality and Individual Differences 44, pp. 1596–1603.
[12] Clark, R. C., Nguyen, F. and Sweller, J. (2006), Efficiency in Learning: Evidence-Based Guidelines to Manage Cognitive Load, Pfeiffer, San Francisco.
[13] Salmerón, L., Baccino, T. and Cañas, J.J. (2006), How Prior Knowledge and Text Coherence Affect Eye Fixations in Hypertext Overviews, 28th Annual Conference of the Cognitive Science Society (CogSci 2006), pp. 715–719.
[14] Gigerenzer, G. and Hoffrage, U. (1995), How to improve Bayesian reasoning without instruction: frequency formats, Psychological Review 102, pp. 684–704.
[15] Kirkpatrick, S., Gelatt, C.D. and Vecchi, M.P. (1983), Optimization by Simulated Annealing, Science, New Series, Vol. 220, No. 4598, pp. 671–680.
[16] MacKay, D.J.C. (2002), Information Theory, Inference & Learning Algorithms, Cambridge University Press, New York, NY.
[17] West, D.B. (2001), Introduction to Graph Theory, 2nd ed., Prentice-Hall, Upper Saddle River, NJ.
[18] Zadeh, L.A. (1988), Fuzzy Logic, Computer 21(4), pp. 83–93.
[19] Rokach, L. and Maimon, O. (2008), Data Mining with Decision Trees: Theory and Applications, Series in Machine Perception and Artificial Intelligence Vol. 69, World Scientific Publishing, Singapore.
[20] Harrell, F.E. (2001), Regression Modeling Strategies with Applications to Linear Models, Logistic Regression, and Survival Analysis, Springer Series in Statistics.
[21] Dror, I. E., Schmitz-Williams, I.C. and Smith, W. (2005), Older adults use mental representations that reduce cognitive load: Mental rotation utilises holistic representations and processing, Experimental Aging Research 31(4), pp. 409–420.
[22] IMS Learning Design specification, http://www.imsglobal.org/learningdesign/ (last accessed 29 May 2009).


AN IMPROVED HAPTIC ALGORITHM FOR VIRTUAL BONE SURGERY

Denis Stack, Joe Connell
Cork Institute of Technology

ABSTRACT

With the rapid development of virtual environments as part of surgical training in bone surgery, the specific challenge of feedback to the haptic device controlling the virtual drill continues to receive attention. This paper reviews the evolution of relevant haptic algorithms and measures the relative performance of those which are proxy and non-proxy based using real force data. An improved proxy-based feedback algorithm is also presented for use in a temporal bone surgery simulator. This new algorithm uses a spherical proxy which prevents the dynamic object, the drill burr, from passing through gaps smaller than its own size.

1. INTRODUCTION

Traditionally, the milling procedure for temporal bone surgery has been taught to students through demonstration and practice on cadaveric specimens. Variations in the supply of these specimens, coupled with technical advancements in visualisation, have contributed to the development of virtual temporal bone surgery simulators. Examples include [1], [2], [3] and [4]. Simulators can also offer the facility of integrating patient-specific data. This means surgeons can perform a dry run prior to actual surgery. Central to a simulator's effectiveness is the handheld, or haptic, device which takes the place of the surgical drill. It is imperative that the user experiences, through haptic feedback, forces consistent with those experienced during a real drilling procedure. This is one of the bigger challenges in this area.

Typically, in a bone milling simulator, the underlying data derives from CT scans and is rendered on screen in 3D. Forces must be calculated based on the interaction between the virtual drilling tool and the voxels representing bone. This is in contrast to traditional haptic methods which calculated forces based on explicit surfaces.

The approach to calculating these forces can be broadly divided into two main categories: proxy-based and non proxy-based. Typically, proxy-based algorithms calculate the output based on a spring-damper force between the proxy position and the user effector position, [2], [5] and [6]. Non proxy-based algorithms calculate haptic forces based on the intersection between the virtual drill burr sphere and voxels; e.g. [3] uses a system of virtual springs, while [1] and [7] use algorithms based on sample points on the drill tool.

In this paper, an improved proxy-based haptic algorithm is presented which maintains proxy-surface contact while preventing the proxy becoming immersed within the volume. It uses a new, more complex method of force calculation which achieves greater accuracy when compared with real-world force data. A second, non proxy-based algorithm similar to [1] is also presented, and the online haptic data repository discussed in [1] is used to evaluate both algorithms.

The paper is organised as follows: Section 2 discusses the evolution of haptic algorithms relevant to bone surgery, Section 3 describes the proposed proxy-based and non proxy-based algorithms in detail, Section 4 presents results and performance evaluation and Section 5 outlines future work.

2. EVOLUTION OF RELEVANT ALGORITHMS

McNeely et al [8] used a voxmap-point shell algorithm to provide haptic feedback, in which the dynamic object is represented by surface sample points. In each haptic frame all points are checked for intersection with voxels from the object being explored. If a point is found to be in contact with the object then a normal is traced towards the centre of the dynamic object.
All normals are then summed and scaled to give an overall force vector.

Bryan et al [3] projected virtual springs outwards from the surface of the drill burr sphere. These form a force field which strives to maintain a minimum separation distance between burr and bone. The limitation here is that only bone surface information is used. Volumetric information is not used, which rules out the possibility of varying forces according to bone density.

Petersik et al [7] developed a haptic interaction algorithm based on 26 pre-computed sample points on a spherical drill surface. A drawback with non proxy-based algorithms such as this is that a sufficiently forceful push by the user can cause the virtual burr to become completely immersed in the bone. To counteract this, a basic proxy method was implemented for use only when immersion occurred.

Lundin et al [5] developed a new, point-based proxy algorithm for direct volume haptics.


Material properties such as stiffness, friction and penetrability were included. As explicit surfaces do not exist in a volumetric representation of an object, a local surface was extracted in each haptic frame.

Vidholm et al [6] presented a proxy-based algorithm based on a combination of [5] and [7]. A spherical proxy prevents the burr passing through small holes in the volume. However, when exploring complex surfaces, the spherical proxy can become completely immersed within the volume, causing the algorithm to fail.

Eriksson et al [2] modified an algorithm based on that in [6] to address the immersion problem. Using an iterative approach to keep the proxy in contact with the surface of the object, and thus never completely immersed, the proxy's centre is maintained on the surface of the volume. The disadvantage is that the algorithm behaves as if the proxy were a point and therefore the haptic probe can pass through tiny gaps in a volume.

Morris et al [1] present a method similar to [7] except that the spherical drill burr is discretized using evenly-spaced volume sample points rather than surface samples only.

The algorithms discussed above can be divided into two categories: proxy-based, [2], [5], [6], and non proxy-based, [1], [3], [7]. The three non proxy-based algorithms were developed using a Phantom Premium haptic device [9], which has improved stiffness over the lower-end Phantom Omni device [9].

Until Morris et al [1], there were no concrete standardized methods for evaluating haptic algorithms. Now, an online data repository allows haptic algorithm developers to compare output with real-world data collected from a force probe. This proves extremely useful for comparing the relative performance of algorithms, and it is used here to compare the non proxy-based algorithm in [1] with the proposed improved proxy-based algorithm. The algorithm in [1] represents the most recent development in the non proxy-based class of haptic bone surgery algorithms.

3. HAPTIC ALGORITHMS

This section describes the two haptic algorithms chosen for implementation and performance evaluation using a Phantom Omni.

3.1. Non Proxy-Based Algorithm

In each haptic frame, bone voxels which are found to intersect with the drill burr sphere contribute individual unit forces which act to push the drill burr out of the volume. The unit forces are then summed to yield an overall force vector, which is then scaled down to prevent instability caused by a large initial contact force. The bone voxels were used directly to compute the force, which with adequate voxel resolution is sufficient.
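As a rough illustration of this per-frame computation, the following is a minimal sketch; it is not the authors' implementation, and the array layout, names and scale factor are assumptions.

```python
import numpy as np

# Sketch of the non proxy-based algorithm of Section 3.1 (illustrative only):
# every bone voxel intersecting the burr sphere contributes a unit force that
# pushes the burr out; the sum is scaled down to avoid a large contact force.

def non_proxy_force(burr_centre, burr_radius, bone_voxels, scale=0.05):
    """bone_voxels: (N, 3) array of bone voxel centre coordinates."""
    offsets = burr_centre - bone_voxels               # voxel -> burr centre
    dists = np.linalg.norm(offsets, axis=1)
    inside = (dists < burr_radius) & (dists > 0.0)    # intersecting voxels
    if not inside.any():
        return np.zeros(3)                            # free space: no force
    unit_forces = offsets[inside] / dists[inside, None]
    return scale * unit_forces.sum(axis=0)            # scaled resultant
```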
3.2. Improved Proxy-Based Algorithm

Similar to [2] and [6], this algorithm uses a spherical proxy which remains in contact with the bone surface as the user effector penetrates the volume. A local surface is generated in each haptic frame, to which the proxy is bound, and thus the proxy can follow complex surfaces.

In [2], [5] and [6], the force sent to the haptic device is calculated using the well-known Hooke's law F = −kx for a virtual spring connecting the user position to the proxy position, where x is the user penetration distance and k is a spring constant. This is in contrast with the non proxy-based algorithm discussed in Section 3.1, in which output force varies linearly with penetration volume into the immersed portion of the virtual burr. It was found by comparison with real-world force data that this volume approach provided more accurate results, as shown in Section 4. In this improved proxy-based algorithm, instead of using Hooke's law, force is varied as if there were a sphere penetrating the volume by a distance equal to that between the proxy and the user effector, thus mimicking the non proxy-based algorithm.

For a sphere intersecting a planar surface, the volume v of the intersecting sector is

v = (π/3) h²(3r − h)    (1)

where r is the radius of the sphere and h is the penetration distance into the volume. Initially, output force was varied according to Equation 1 and scaled down by a constant, a. This constant may be varied to adjust the stiffness of a surface. The radius r of the virtual sphere should be chosen to be more than a few voxels. A larger r will simply scale up the resultant force linearly.

To remove the simplification of a local surface being planar, the output of Equation 1 must again be scaled by the ratio

b = intersecting_volume / sector_volume    (2)

where intersecting_volume is the total volume of voxels intersecting the proxy sphere, and sector_volume is the volume of the proxy sphere which would have intersected the object volume had the surface been planar. The total output force in a given haptic frame is thus

F = a · b · (π/3) h²(3r − h)    (3)
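A minimal sketch of Equations 1–3 follows; the names and the stiffness constant a are illustrative assumptions, and the direction of the output force would be taken from the local surface gradient described next.

```python
import math

# Sketch of the force magnitude of Equations 1-3 (illustrative names).
# h: proxy-to-effector distance; r: virtual sphere radius (a few voxels+).

def proxy_force_magnitude(h, r, intersecting_volume, a=0.5):
    planar_sector = (math.pi / 3.0) * h**2 * (3.0 * r - h)   # Equation (1)
    if planar_sector <= 0.0:
        return 0.0                                            # no penetration
    b = intersecting_volume / planar_sector                   # Equation (2)
    return a * b * planar_sector                              # Equation (3)
```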


When the user effector is in free space, proxy position and user position are identical. When contact with the bone is detected, the proxy position is then set and constrained to the surface of the bone while the user effector penetrates the volume. Local surface gradients are calculated by checking a spherical volume around the proxy centre position equal to the virtual drill burr volume. Each bone voxel found to intersect the proxy sphere contributes a unit-length vector which is traced towards the centre of the proxy. These unit vectors are summed and the result normalized to give the local gradient vector.

In each haptic frame, the proxy moves along the surface plane by an amount equal to the projection of the vector between proxy and effector onto the surface plane. This is illustrated in Figure 1. Since a new proxy surface is constructed in each frame, deformation of bone does not adversely affect the algorithm.

Fig. 1. Proxy Movement

When exploring concave surfaces using this algorithm, the proxy will inevitably become immersed within the bone volume, thus causing algorithm failure. As haptic frames are executed, the increasing number of proxy contact voxels can cause the bone to eventually consume the proxy, as illustrated in Figure 2. This problem is also identified in [2]. It is also possible that the proxy may 'veer off' a convex surface, thus losing contact with all voxels and resulting in failure to calculate a local surface gradient.

Fig. 2. Proxy Immersion

To prevent the proxy from becoming immersed within the volume, two zones are defined within the proxy sphere, as illustrated in Figure 3. While contact with some surface voxels is required to calculate a surface gradient, deep immersion within the bone must be avoided. Zone A in Figure 3 is the permitted zone for contact voxels; zone B is the prohibited zone. If the proxy reaches a particular position which results in bone voxels in zone B, then the algorithm automatically adjusts the proxy position to empty zone B of voxels. Conversely, if exploring a convex surface, the proxy may veer off the surface, thus losing contact with all bone voxels even though the user effector is still inside the volume. To prevent this, the proxy position is adjusted in each frame to keep zone B close to bone voxels so that there will always be bone voxels in zone A.

Fig. 3. Proxy Zoning

Surface friction is implemented in a manner similar to [5].

4. EVALUATION AND RESULTS

Both algorithms were implemented on a temporal bone surgery simulator using SensAble's OpenHaptics toolkit [10] and a Phantom Omni device. This toolkit facilitates low-level communication by directly sending forces and reading device position. Simulator visualization used a combination of VTK and OpenGL. The simulation was run in Windows XP on an Intel Core Duo E8600 CPU running at 3.33 GHz.

The online haptic data repository in [1] was used to compare both haptic algorithms' output with real-world force data. For each haptic frame, a root-mean-squared method was used for accuracy evaluation. The overall normalized RMS error e is given as

e = (1/N) Σ_{i=1}^{N} |F_i^r − F_i^a| / |F_max^r|    (4)

where N is the number of force trajectory samples, F_i^r is the real force at sample i, F_i^a is the algorithm's output force at sample i and F_max^r is the magnitude of the maximum real output force.
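A short sketch of Equation 4 as it might be coded (the helper name and array layout are assumptions):

```python
import numpy as np

# Sketch of the normalized RMS error of Equation 4 (illustrative helper).

def normalized_rms_error(real_forces, algo_forces):
    """real_forces, algo_forces: (N, 3) arrays of sampled force vectors."""
    diffs = np.linalg.norm(real_forces - algo_forces, axis=1)  # |F_i^r - F_i^a|
    f_r_max = np.linalg.norm(real_forces, axis=1).max()        # |F_max^r|
    return diffs.mean() / f_r_max                              # e of Eq. (4)
```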


Table 1. Algorithms' Performance

Algorithm                   Improved Proxy   Hooke Proxy   Non Proxy
Normalized RMS Error (%)    9.96             13.08         9.90
Frame Time (ms)             0.376            0.374         0.135

Table 1 shows the results obtained for an input trajectory of 4,000 samples. The accuracy of the improved proxy and non proxy-based algorithms is comparable. Average execution times per haptic frame are also shown.

Fig. 4. Real World Forces (solid) vs. Non-Linear Force Calculation (dashed) vs. Hooke's Law Force Calculation (dash-dot)

5. CONCLUSION AND FUTURE WORK

An improved proxy-based and a non proxy-based algorithm were implemented and their performance was measured against real-world force data, with both achieving comparable accuracy. The improved proxy-based haptic algorithm offers the potential advantages of controlling material parameters such as friction, stiffness, viscosity and penetrability as in [5]. These properties are difficult to implement without a proxy. When the non proxy-based algorithm was used with the Phantom Omni, the maximum bone stiffness which could be achieved without instability was poor, even when employing averaging methods such as in [7] and multi-gain methods such as in [1]. The improved proxy-based algorithm achieves greater stiffness without instability. Another advantage of using a proxy-based algorithm is that the pop-through problem of non proxy-based algorithms is eliminated.

Future work will examine the effect of surface friction on the accuracy of the proposed proxy-based algorithm. Also, while the haptic evaluation method due to Morris et al is useful for evaluating haptic algorithms exploring the surface of an object, a method for comparing the haptic forces involved when drilling bone to real-world data is still needed.

6. REFERENCES

[1] Morris, D.: Haptics and Physical Simulation for Bone Surgery. PhD Thesis, Stanford University (2006).
[2] Eriksson, M.: Haptic and Visual Simulation of a Material Cutting Process. Licentiate Thesis, TRITA-STH Report 2006:03, ISSN 1653-3836, ISRN/STH/--06:3--SE (2006).
[3] Bryan, J.: A Virtual Temporal Bone Dissection Simulator. Master's Thesis, Ohio State University (2001).
[4] Agus, M., Giachetti, A., Gobbetti, E., Zanetti, G., Zorcolo, A.: A Multiprocessor Decoupled System for the Simulation of Temporal Bone Surgery. Comput. Visual Sci. 5: 35–43 (2002).
[5] Lundin, K., Ynnerman, A., Gudmundsson, B.: Proxy-based Haptic Feedback from Volumetric Density Data. Proc. Eurohaptics, pp. 104–109 (2002).
[6] Vidholm, E.: Visualization and haptics for interactive medical image analysis. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 386, 83 pp., Uppsala, ISBN 978-91-554-7067-8 (2008).
[7] Petersik, A., Pfessler, B., Tiede, U., Hoehne, K.H., Leuwer, R.: Haptic Volume Interaction with Anatomic Models at Sub-Voxel Resolution. Proc. 10th Symp. on Haptic Interfaces for Virtual Envir. and Teleoperator Systs., 0-7695-1489-8/02 (2002).
[8] McNeely, W. A., Puterbaugh, K. D., Troy, J. J.: Six Degree-of-Freedom Haptic Rendering Using Voxel Sampling. Proc. 26th Conference on Computer Graphics and Interactive Techniques, pp. 401–408 (1999).
[9] SensAble Technologies, http://www.sensable.com/products-hapticdevices.htm
[10] OpenHaptics, http://www.sensable.com/openhaptics-academic-edition-faqs.htm


Corpus Design Techniques for Irish Speech Synthesis

Amelia C. Kelly, Harald Berthelsen, Nick Campbell, Ailbhe Ní Chasaide, Christer Gobl
Phonetics and Speech Laboratory, SLSCS, Trinity College Dublin, Ireland
kellya16@tcd.ie, berthelh@tcd.ie, nick@tcd.ie, anichsid@tcd.ie, cegobl@tcd.ie

Abstract—Unit selection is a data-driven approach to speech synthesis that concatenates pieces of recorded speech from a large database in order to create novel sentences. Many corpora are available in the English language, including the Arctic database [1], which allows a user to create small, reliable speech synthesisers using only a small set of recorded sentences. Such resources for minority languages are scarce, however, despite their increasing importance for the survival of such languages. This paper describes current research in creating efficient Irish language corpora for speech synthesis. Corpus design techniques are discussed, in particular two methods of data reduction that are applied to an aligned spoken corpus of Irish in order to create smaller, more efficient speech corpora.

Index Terms: speech synthesis, corpus design, Arctic, Irish

I. INTRODUCTION

The unit selection method of speech synthesis is a data-driven, concatenative technique that draws on a large database of recorded speech, from which it can select speech segments and join them together to create novel utterances. The content of the speech database, or corpus, is vital to the performance of the synthesiser on which it is built. Naturally it is impossible to create a corpus that contains every speech sound in the language in every context in which it can be spoken; however, the use of an overly large corpus will significantly slow down the performance of the synthesiser. As a result there exists a trade-off between the quality and the performance of the corpus with respect to size. The Arctic database for English is an example of a corpus that has been designed to address this trade-off, and is intended to be as compact as possible while still containing the greatest diversity of linguistic units in a variety of contexts. The benefit of such a corpus is that it is freely available for download and can be used with the open-source Festival Speech Synthesis System [2] to create a personal speech synthesiser.

Despite the advances in speech technology, resources remain scarce for endangered minority languages that have experienced a decline in the number of native speakers. Irish is an example of such a language, one that has fallen further behind in technology development due to the lack of resources available and the lack of commercial incentive. Since speech technology has a crucial role to play in education and accessibility, speakers of minority languages are becoming particularly disadvantaged [3].

This study aims to address the scarcity of resources for the Irish language by creating an Irish speech database that, like the Arctic database, is recorded by a number of speakers and freely available for download. This short paper outlines the approach adopted in creating an Irish language speech database and discusses two techniques¹ for designing the corpus and determining its content. The process begins with a large amount of annotated sentences, of which a subset is selected based on criteria that would deem certain sentences more suitable than others for inclusion in the final corpus.
The first technique, as used to create the Arctic database, employs a greedy algorithm [4] to select only the most phonetically diverse sentences for the corpus, so that the greatest number of contextually dependent speech units are found in the smallest set of sentences. The second is a novel technique which effectively removes over-represented and therefore redundant units from the corpus, by determining which units get selected more often by the unit selection synthesiser and removing all similar ones that tend not to be chosen. Both methods result in a smaller subset of sentences being chosen for the corpus. While the first method is anticipated to select only linguistically diverse sentences, the second method will retain the best quality examples. Since the techniques will be carried out on a recorded, annotated corpus, the better method can be determined by comparing the coverage of each database and by evaluating the synthetic speech output from synthesisers based on the corpora. The techniques can then be used to reduce large amounts of data to a smaller subset before recording. The creation of small, efficient spoken Irish corpora will contribute greatly to meeting the growing demand for minority language speech technology, in particular to the creation of Irish synthesisers.

II. APPROACH

Creating a corpus involves (i) selecting source material, (ii) analysing the corpus to determine unit coverage statistics, (iii) selecting the most phonetically varied sentences from the source material, and (iv) recording a speaker [5].

A. Gathering and analysing source material

Research by the Phonetics and Speech Laboratory in Trinity College Dublin has resulted in the creation of the first Irish unit selection speech synthesiser, available to use at http://www.abair.ie. The corpus of roughly 9,000 sentences used to create this synthetic voice draws from internet news sites and out-of-copyright works of fiction, and is also used as the source text for testing the two data reduction techniques to create small corpus subsets.

¹ Currently being developed by Harald Berthelsen and Amelia Kelly.


Most of the text is specific to the Gaoth Dobhair dialect of Irish, but in order to create a more adaptable speech database, a core of dialect-neutral text should be used as the basis set, and corpora of other Irish dialects can be created by adding dialect-specific texts to the basis set. The source material can then be analysed to get statistics on the frequency of occurrence of linguistic units, in order to compare the linguistic unit coverage of the selected subsets.

B. Sentence selection techniques

1) The Greedy Algorithm: The greedy algorithm is an iterative technique that allows the creation of smaller corpora by choosing a subset of sentences from the basis set so that the largest number of linguistic units in context are represented in the smallest number of sentences. This is achieved by first choosing a unit size by which to define linguistic unit coverage. In this study, the base unit originally chosen was the diphone, that is, the span from the midpoint of one phoneme to the midpoint of the adjacent one. At this time, the criterion for whether a sentence gets included in the subset depends on how phonetically varied the sentence is, given by the distribution of phones, diphones and triphones within the sentence. For example, the word "cat" contains three phones (/k/, /ae/ and /t/), four diphones (the transition from silence to the beginning of the word is given by /#-k/, followed by /k-ae/, /ae-t/, and the transition back to silence /t-#/) and three triphones (/#-k-ae/, /k-ae-t/ and /ae-t-#/). The algorithm iterates through the sentences and chooses the one with the largest number of unique linguistic units, removing it from the basis set to be stored as the first sentence of the corpus subset. The algorithm repeats this until a specified number of sentences have been collected. The smallest corpus that achieves maximum occurrence of these features is said to be the one with the best linguistic unit coverage.

2) The Waste Disposal Method: The waste disposal method does not focus on the linguistic variability within a sentence, but instead removes sentences from the corpus if the units in the sentence are satisfactorily represented elsewhere in the database. A sentence can be deemed redundant, and therefore removed from the basis set, if it can be synthesised using units from the rest of the corpus and the synthesised version is shown to be acoustically similar to the original recording. This can be achieved by removing a sentence from the corpus and then attempting to synthesise that sentence using only the sentences that remain in the database. The synthesised version can be compared with the original recording using acoustic distance measures (e.g. Euclidean distance between Mel-frequency cepstral coefficient (MFCC) vectors [6]) and perceptual tests (like those conducted for the Blizzard Challenge²). If they are similar, it may then be concluded that the sentence can be removed from the database without degrading the quality of the synthesiser.

² The Blizzard Challenge, http://festvox.org/blizzard/

C. Experiment Design

The data reduction techniques outlined above are performed on the basis set of 9,000 Irish sentences.
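To make the selection loop of Section B.1 concrete, here is a minimal sketch; it is a toy under stated assumptions, not the project's implementation, and it assumes sentences are already phonetised as lists of phone symbols.

```python
# Toy sketch of greedy sentence selection over phones/diphones/triphones.

def units(phones):
    """Phones, diphones and triphones of a sentence, with '#' as silence."""
    padded = ['#'] + list(phones) + ['#']
    u = set(phones)                                              # phones
    u |= {a + '-' + b for a, b in zip(padded, padded[1:])}       # diphones
    u |= {a + '-' + b + '-' + c
          for a, b, c in zip(padded, padded[1:], padded[2:])}    # triphones
    return u

def greedy_select(phonetised, n_sentences):
    """phonetised: dict of sentence id -> phone list. Returns chosen ids."""
    remaining = dict(phonetised)
    covered, chosen = set(), []
    while remaining and len(chosen) < n_sentences:
        # pick the sentence contributing the most units not yet covered
        best = max(remaining, key=lambda s: len(units(remaining[s]) - covered))
        covered |= units(remaining[best])
        chosen.append(best)
        del remaining[best]
    return chosen

# units(['k', 'ae', 't']) yields exactly the 3 phones, 4 diphones and
# 3 triphones listed above for the word "cat".
```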
The most efficient corpus will be thesmallest sized one that has the maximum amount of coverage.For the waste disposal method, just one corpus needs to becreated that has minimum unit redundancy. The unit coveragefor this corpus can then be compared with that chosen fromthe greedy algorithm technique. Further comparison of themethods can be carried out by recording a speaker readingthe prompt sets for each corpus and creating synthesisers outof the resulting recorded speech databases. Evaluating thesynthesisers by designing perception test will provide furtherinformation as to the merit of each data reduction technique.III. CONCLUSIONThe main focus of this research is to provide speech technologyresources for the Irish language. The creation of smallfreely-available corpora will allow the creation of efficientand intelligible speech synthesisers, which are indispensablefor use as teaching and learning resources and accessibilitytools for the visually and vocally disabled. The size of thespeech database used for synthesis will determine its qualityand speed. In order to determine the most suitable databasein terms of size and content, two data reduction techniquesare described in which sentences can be selected from a largebody of data to form small corpora of maximum linguisticcoverage. Further comparisons can be made between themethods by evaluating the quality of synthetic voices basedon the corpora. Further challenges involved in distributingthe recorded speech databases are selecting and recordinga speaker for each major dialect of Irish. By gathering acore set of what can essentially be considered dialect-neutralmaterial, supplementing it with dialect-specific material, andapplying the data reduction techniques described above, wehope to create freely-available Irish corpora, in keeping withthe growing need for minority language speech technologyresources.IV. ACKNOWLEDGEMENTSThe CABÓGAÍ II project is funded by Foras na Gaeilge.REFERENCES[1] Kominek, J. and Black, A. W., “CMU ARCTIC databases for speechsynthesis”, Language Technologies Institute, Carnegie Mellon University,Pittsburgh, PA, 2003.[2] Black, A. W., Taylor, P. and Caley, R., “The Festival speech synthesissystem”, http://festvox.org/festival, 1998.[3] Ní Chasaide, A., Wogan, J., Ó Raghallaigh, B., Ní Bhriain, Á., Zoerner,E., Berthelsen, H. and Gobl, C., “Speech Technology for MinorityLanguages: the Case of Irish (Gaelic)”, Interspeech, Pittsburgh, PA,2006.[4] van Santen, J. P. H. and Buchsbaum, A. L., “Methods for Optimal TextSelection”, Eurospeech, Greece, 1997.[5] Ni, J., Hirai, T., Kawai, H., Toda, T., Tokuda, K., Tsuzaki, M., Sakai, S.,Maia, R. and Nakamura, S., “ATRECSS – ATR English Speech Corpusfor Speech Synthesis”, The Blizzard Challenge 2007 – Bonn, Germany,August 25, 2007.[6] Vepa, J., King, S. and Taylor, P., “Objective Distance Measures for SpectralDiscontinuities in Concatenative Speech Synthesis”, <strong>Proceedings</strong> ofthe IEEE workshop on Speech Synthesis, 2002.
