SAPHYRE

Contract No. FP7-ICT-248001

Adaptive and Robust Signal Processing in Multi-User and Multi-Cellular Environments (final)

D3.1b

Contractual date: M36 Actual date: M36

Authors: ▇▇▇▇▇▇▇ ▇▇▇▇▇, ▇▇▇▇▇▇ ▇▇▇▇▇▇, ▇▇▇▇▇▇▇▇▇ ▇▇▇▇▇▇▇▇, ▇▇▇▇▇▇ ▇▇▇▇▇▇▇, ▇▇▇▇▇ ▇▇▇▇▇▇▇, Zuleita ▇▇ ▇▇▇▇ ▇▇, ▇▇▇▇▇▇ Jor- ▇▇▇▇▇▇, ▇▇▇▇▇▇▇▇▇▇▇ ▇▇▇▇▇▇▇▇▇, ▇▇▇▇ ▇. ▇▇▇▇▇▇▇, ▇▇▇▇▇▇▇ ▇▇, ▇▇▇▇▇▇▇▇ ▇▇▇▇▇▇▇▇, ▇▇▇▇▇▇▇▇▇ ▇▇▇▇▇▇▇▇▇, ▇▇▇ ▇▇▇▇▇▇, ▇▇▇ ▇▇, ▇▇▇▇▇▇ ▇▇▇▇▇

Participants: CTU, ECM, LiU, TNO, TUD, TUIL Work package: WP3

Security: Public Nature: Report Version: 1.0

Number of pages: 179

Keywords

Spectrum Sharing, Infrastructure Sharing, SAPHYRE Gain, Interference Channel, Relay.

Executive Summary

This document describes the signal processing techniques which have been devel- oped within work package (WP) 3 of the SAPHYRE project. Many fundamental aspects of physical resource source sharing in future wireless networks, including the spectrum and/or the infrastructure sharing, are investigated. First, the informa- tion exchange mechanisms between operators, which are a prerequisite for physical resource sharing, are discussed in Chapter 2. This part of the deliverable contains an overview of the information exchange requirements as well as an individual anal- ysis for three sharing scenarios. Afterwards, advanced signal processing algorithms are proposed to accomplish different forms of spectrum and infrastructure sharing majorly by exploiting MIMO techniques. To be specific,

•

In Chapter 3 interference mitigation techniques in spectrum sharing multi-cell wireless networks are studied.

– Section 3.1 considers the joint precoding across the transmitters of co- operating operators (sharing the same spectrum) with varying backhaul data symbol routing overhead. With the assumption of perfect CSI, a sum rate maximization problem with limited backhaul capacity is for- mulated. Since the conventional precoders cannot be applied directly in this setting, precoder for a given data symbol allocation pattern is de- signed. Using the developed precoding technique, an algorithm aiming to find the optimal data symbol allocation subject to a constraint on the backhaul capacity is proposed.

– Section 3.2 deals with the problem of channel estimation in the pres- ence of multi-operator interference generated by spectrum sharing, and more specifically pilot contamination. First, a Bayesian channel estima- tion method making explicit use of covariance information in the inter- operator interference scenario with pilot contamination is developed. It is shown that the channel estimation performance is a function of the degree to which dominant signal subspaces pertaining to the desired and interference channel covariance overlap with each other. Therefore, the fact that the desired user signals and interfering user signals are received at the operator side with (at least approximately) finite-rank covariance matrices is exploited. Finally, a pilot sequence assignment strategy based on carefully assigning selected groups of users to identical pilot sequences is proposed.

– Section 3.3 considers the downlink of a multicell network where neigh-

boring multi-antenna base stations share the spectrum and coordinate their frequency and spatial resource allocation strategies to improve the overall network performance. The coordinated scheduling and multiuser transmit beamforming problem is combinatorial. Thus, a mixed-integer second-order cone program is formulated and a low-complexity branch & bound algorithm that yields the optimal solution is proposed.

•

In Chapter 4 distributed MIMO signal processing and resource allocations algorithms for multi-operator spectrum sharing networks are developed and discussed.

– In Section 4.1 a distributed beamforming algorithm is proposed for the two-user multiple-input single-output (MISO) interference channel (IC). The algorithm is iterative and uses as bargaining value the interference that each transmitter generates towards the receiver of the other user. It enables cooperation among the transmitters in order to increase both users’ rates by lowering the overall interference. The algorithm is also equally applicable when the transmitters have either instantaneous or statistical channel state information (CSI). Moreover, the outcome of the proposed algorithm is approximately Pareto-optimal.

– In Section 4.2 the problem of adaptive allocation of spectrum, power, and rate is considered for the downlink of multicell orthogonal frequency- division multiple-access (OFDMA) networks. The considered networks are assumed to have frequency reuse one and discrete-level rates. The the joint allocation problem as a nonlinear mixed integer program (MIP) is formulated, which is computationally intractable to solve optimally for practical problem sizes. Thus, using the fact that the receivers can measure the perceived interference-plus-noise on every subcarrier, the original problem is decomposed into subproblems that are solved by in- dividual base stations. Each subproblem is a linear MIP and its optimal solution can be obtained by means of standard branch-and-cut solvers, at least for small to medium problem sizes. In order to further reduce the complexity, a distributed iterative algorithm that capitalizes on the subgradient method is proposed.

•

In Chapter 5 the relay-aided spectrum and infrastructure sharing scenarios are presented. Problems of optimizing the system performance with respect to the sum rate, the secrecy rate, and the energy efficiency are formulated and analyzed.

– Section 5.1 focuses on a multi-user wireless network which consists of mul- tiple one-way smart instantaneous relays (i.e., the signals of the source- destination link arrive at the same time as the source-relay-destination link) and one-way dumb relays, where the smart relays are able to gather channel state information, perform linear processing and forward sig-

nals whereas the dumb relays are only able to serve as amplifiers. The achievable rate region is studied in the scenario of a) uninformed non- cooperative source-destination nodes (source and destination nodes are not aware of the existence of the relay and are non-cooperative) and (b) informed and cooperative source-destination nodes. Furthermore, an al- gorithm which is based on interference neutralization is developed and analyzed.

– Section 5.2 studies the users’ privacy in heterogeneous dense networks, where the spectrum is shared. On a multi-antenna relay-assisted multi- carrier interference channel, each user shares the frequency and spatial resources with all other users. When the receivers are not only interested in their own signals but also in eavesdropping other users’ signals, the cross talk on the frequency and spatial channels becomes information leakage. To avoid such a leakage, a novel secrecy rate enhancing relay strategy is proposed, which utilizes both frequency and spatial resources, termed as information leakage neutralization.

– In Section 5.3 a two-operator relay sharing scenario with a one-way amplify-and-forward MIMO relay is presented. The problem of mini- mizing the transmit power at the relay subject to SINR constraints at the destination nodes is studied. Both optimal and suboptimal precoding solutions are derived.

– Section 5.4 investigates a sharing scenario where both the BS and a two-way amplify-and-forward MIMO relay are shared among multiple operators. The sum rate maximization problem for this system is non- convex and in general NP-hard. Thus, heuristic methods which are based on the channel inversion, the concept of projection based separation of multiple operators (ProBaSeMO), and zero-forcing dirty paper coding are developed and compared.

– The sharing scenario in Section 5.5 consists of multiple two-way amplify- and-forward relays where each relay has its own transmit power con- straint. The sum rate maximization problem of such a network is NP- hard. First, a global optimal solution based on the polyblock algorithm is derived. Due to its high computational complexity, this algorithm can only be used as a benchmark. Afterwards, inspired by the polynomial time difference of convex functions (POTDC) method, a sub-optimal so- lution which has lower complexity but comparable performance is devel- oped. To further reduce the computational complexity, the total SINR eigen-beamformer and an interference neutralization based design are proposed. Detailed comparison and analysis of the proposed algorithms are performed.

– Section 5.6 considers a relay sharing scenario with asymmetric relay roles,

where the sources use network coded modulation (NCM) and the re- lay applies the hierarchical decode and forward (HDF) strategy. The NCM/HDF technique is vulnerable to the mutual phase rotations of the signals from the sources. Taking the advantage of additional degrees of freedom in SIMO channels, a specific beamforming is tailored for the given NCM/HDF strategy. The resulting beamforming strategy is equal- izing the signals at the relay to optimize the processing of hierarchical information to take the full advantage of HDF.

Notation 9

1 Introduction 13

2 Information Exchange Mechanisms Between Operators 17

2.1 Introduction 17

2.2 General Analysis 19

2.3 Individual Analysis per Transmission Scheme 22

3 Interference Mitigation in Spectrum Sharing Networks 27

3.1 Partial Data Sharing in Multicell Spectrum Sharing Networks 27

3.2 A Coordinated Approach to Channel Estimation 33

3.3 Coordinated Scheduling and Beamforming for Spectrum Sharing Net- works 43

4 Distributed MIMO Signal Processing and Resource Allocation 59

4.1 Cooperative Beamforming 59

4.2 Resource Optimization in Multicell OFDMA Networks 75

5 Relay Assisted Infrastructure Sharing and Spectrum Sharing 85

5.1 Interference Neutralization 85

5.2 Information Leakage Neutralization 100

5.3 Energy Efficient Relay Sharing with One-Way Amplify-and-Forward MIMO Relays 116

5.4 Relay Broadcasting Channel with a Two-Way Amplify-and-Forward MIMO Relay 135

5.5 Resource Sharing in Two-Way Relaying Networks with Amplify-and- Forward Relays 143

5.6 Network Coded Modulation in 2-Source 2-Relay Network 154

6 Conclusions 167

Bibliography 171

Notation

Abbreviations

AF amplify and forward ANOMAX algebraic norm maximization AOA Angle of Arrival

AQM Active Queue Management

BD block diagonalization

BS base station

CB covariance based

CBS Commit Burst Size

CIR Commit Information Rate

CoZF coordinated zero-forcing

CPA Coordinated Pilot Assignment

CSI channel state information

DCM dual channel matching

DFT discrete Fourier transform

EBS Excessive Burst Size

EIR Excessive Information Rate

EVC Ethernet Virtual Circuit

FDD Frequency Division Duplexing FlexCoBF flexible coordinated beamforming IC interference channel

ICIC Inter-Cell Interference Coordination IETF Internet Engineering Task Force IRC interference relay channel

LTE Long Term Evolution

LS Least Squares

MAP Maximum a Posteriori

MIMO multiple input multiple output MISO multiple input single output MMSE minimum mean square error MR maximum ratio

MRC maximum ratio combing

NBS ▇▇▇▇ bargaining solution

NE ▇▇▇▇ equilibrium

P2P point to point

PDF Probability Density Function

PIR Peak Information Rate

PO Pareto optimal

QoS Quality of Service

RBD regularized block diagonalization

RC relay channel

RED Random Early Detection

SIMO Single Input Multiple Output

SINR signal-to-interference-plus-noise ratio

SLA Service Level Agreement

SNR signal-to-noise ratio

SP Service Provider

SR sum rate

srTCM single rate Three Color Marker SVD singular value decomposition TCP Transmission Control Protocol TDMA time division multiple access trTCM dual rate Three Color Marker

TSW2CM Time Sliding Window Two Color Marker TSW3CM Time Sliding Window Three Color Marker TWR two-way relaying

ULA Uniform Linear Array

UT user terminal

VLAN Virtual LAN

ZF zero forcing

Mathematical Notation

C set of complex numbers

R set of real numbers

Z set of integers

CN complex normal distribution

tr {·} trace of a matrix

rank{·} rank of a matrix

R{·} range (span) of a matrix

N{·} nullspace of a matrix

ΠZ orthogonal projection onto the range of Z

Π

⊥ orthogonal projection onto the orthogonal complement of the range of Z

I identity matrix

{·}

E expectation operator

XT transpose of a matrix

X∗ conjugate of a matrix

XH conjugate transpose of a matrix

ǁ·ǁF ▇▇▇▇▇▇▇▇▇ ▇▇▇▇

X ⊗ Y Kronecker product of two matrices X and Y

1 Introduction

In current wireless communication systems, the radio spectrum and the infrastruc- ture are typically used such that interference is avoided by exclusive allocation of frequency bands and employment of base stations. SAPHYRE will demonstrate how equal-priority resource sharing in wireless networks improves the spectral effi- ciency, enhances coverage, increases user satisfaction, leads to increased revenue for operators, and decreases capital and operating expenditures.

The physical resources which are shared can be divided into two classes, namely spectrum and infrastructure. These are shared with respect to a set of ‘players’, consisting of operators and users. Each player has a set of private information, e.g., operators have their business models and their revenue strategies, users have their private interests and their partly private state information including traffic, mobility, channel parameters. These goals and parameters are usually not revealed to others.

The spectrum sharing is performed with respect to a set of constraints. These constraints are divided into two areas, namely regulatory and environmental con- straints. They can partly overlap as in the case of spectrum masks and power con- straints which are both regulatory and environmental. The main difference between these two areas is that regulatory constraints contain fairness and social welfare or legal issues whereas environmental constraints contain fundamental limitations im- posed by physics.

The resource sharing problems are interdisciplinary and require regulatory and po- litical bodies, business and market experts, and technical input from communica- tion and network engineers. The ongoing discussion about spectrum commons is led mainly from a regulatory and market point of view. However, advances in communi- cation systems (e.g., multi-antenna systems, multi-carrier transmission techniques, adaptive receivers, software defined radio, interference cancellation) are recognized already to have a very strong impact since they enable the efficient and concurrent use of spectrum.

From a communications engineering point of view, different types of orthogonality in frequency, time, space of coding domain have been used for resource allocation depending on the type of interference: For users in one cell operated by one opera- tor (intracell interference) TDMA combined with FDMA (used in GSM systems) or CDMA (combined with TDMA/FDMA in 3G systems) is applied to separate their signals at the receivers. For different sectors or cells, the intercell interference is controlled by applying different frequency reuse factors [1]. Fractional and adaptive

frequency reuse is discussed in LTE and WiMAX [2]. Very recently, techniques for separating transmissions from different operators (inter-operator interference) without orthogonal resource allocation have been developed: First flexible resource sharing approaches have been developed and results indicate that the overall effi- ciency of the system can be improved by sharing different resources in the network between several operators [3, 4]. Sharing of spectrum or infrastructure ends up in creating interference on the physical layer. Therefore, interest in physical and MAC layer optimization for resource sharing has increased recently.

This report is a summary of the design and performance analysis of adaptive and robust signal processing algorithms to exploit the additional degrees of freedom in multi-user and multi-cellular environments, which are achieved in work pack- age WP3. As already pointed out in the SAPHYRE deliverable D3.1a, it is impor- tant to:

•

show the gain (loss) with respect to the chosen performance metric as com- pared to a non-sharing scenario,

•

point out conditions when a significant gain can be achieved for the chosen scenario (topology), and

• illustrate the order of magnitude of this gain.

This sharing gain, namely the SAPHYRE gain, can be defined as the performance comparison in terms of various performance metrics (e.g., the system sum-rate, the achievable rate region, etc.). Now we briefly review the two types of SAPHYRE gains which are defined in the SAPHYRE deliverable D3.1a. These definitions will be also applied in the following this report. The SAPHYRE gain is in general defined in terms of a particular performance metric (e.g., the sum rate) of the simultaneous sharing scenario compared to the time-shared use of the spectrum and infrastructure by the operators (time division case, in this case, the operators and users are multiplexed in the time domain). If the sum rate is chosen as the performance criterion, the absolute SAPHYRE gain is defined as

▇

Ξ = Σ U − 1 Σ U SU, (1.1)

k=1

and the fractional SAPHYRE gain is defined as

ΞF =

k=1

▇

1 Σ U SU

k=1

, (1.2)

∈ { · · · }

where k 1, 2, , K is the index of the users. The utility function of the kth user in the sharing scenario and the time division case are denoted by Uk and U SU,

respectively.

This report is categorized into three parts. In the first part (Chapter 2) the in- formation exchange mechanism, which is required for the algorithms developed in this work package, is discussed. In the second part (Chapters 3 and 4), transmit strategies (Sections 3.1, 3.3, and 4.1), resource allocation schemes (Section 4.2) and channel estimation algorithms (Section 3.2) are studied for multi-operator multi-cell networks. The major difference between Chapter 3 and Chapter 4 is that centralized design/coordination is required for applying the algorithms developed in Chapter 3 while a distributed implementation of the algorithms developed in Chapter 3 is possible. In the third part (Chapter 5), transmission techniques for maximizing the system sum rate (Sections 5.1, 5.4, 5.5, and 5.6), maximizing the secrecy rate (Sec- tion 5.2) or minimizing the transmit power (Section 5.3) in relay assisted physical resource sharing scenarios are presented.

2 Information Exchange Mechanisms Between Operators

2.1 Introduction

The advanced physical-layer transmission schemes, investigated by SAPHYRE WP3, intend to coordinate the radio transmission of cooperating operators net- works on the same radio resource (time, frequency), and thus to mitigate or even minimize the interference among users of different operators sharing the same ra- dio resource. For this purpose, some prior information is typically required at the transmitters, in order to adapt the transmitted radio signals of different operators in such ways that the interference among these signals (as long as they use the same radio resource simultaneously and in the vicinity of each other) is minimized.

The required prior information and their exchange (among network nodes) differ ac- cording to the investigated transmission schemes, duplexing mode (TDD or FDD), direction of the transmission (uplink or downlink), and optionally also to the con- sidered deployment scenarios (e.g. relaying-related or not, co-sited or non-co-sited base stations). In the context of this section below, unless explicitly mentioned, the downlink of FDD systems is assumed in the analysis. Similar analysis can be derived for uplink transmission and the TDD mode with some necessary modification.

Generally, these information may be categorized according to the ways how they are exchanged (among different network nodes or different operators):

•

Information exchanged over the air interface between a terminal and the base station. A typical type of such information is the channel state information (CSI) reported by the receiver to the transmitter. In reality, considering the required signaling overhead, CSI feedback is represented by a finite set of values. For example, in LTE each user terminal (UE) uses a combination of the so-called rank indicator (RI, up to 8 values depending on the number of antennas), precoding matrix indicator (PMI, up to 16 values depending on the number of antennas and the recommended RI) and channel quality indicator (CQI, 15 values) to inform the serving cell the downlink channel condition [5]. The RI provides recommendation on the MIMO transmission rank in the downlink, i.e. the number of independent data streams transmitted in parallel. The PMI indicates which of the specified precoding matrices should preferably be used for the downlink transmission. The CQI represents the maximum modulation and coding scheme (MCS) the UE is able to receive with a block-error probability of at most 10% (using the recommended RI

and PMI). These indicators are used by the base station, jointly with other information, for downlink scheduling and link adaptation. Such quantification of CSI using limited number of values (and thus limited number of signaling bits) introduces inaccuracy of CSI knowledge at the transmitter. In LTE, to further reduce the signaling overhead, these indicators are reported not per physical resource block (PRB, equal to 12 OFDM subcarriers or 180 KHz, which is the frequency unit for scheduling in LTE), but rather per group of adjacent PRBs or even the whole bandwidth. Another source of inaccuracy in CSI at the transmitter is the latency, due to the time used for measurement, signal processing, and radio transmission over the air. In LTE, for example, this latency is typically in the range of 3-4 ms. This is crucial for UEs with high mobility, since the CSI feedback received by the transmitter might be already “out-of-date” for the current transmission. Another type of information, to support retransmission (e.g. in the context of H-ARQ), may be sent by the receiver to the transmitter, indicating whether a new initial transmission or a re-transmission of a formerly transmitted data is expected.

•

Information exchanged via the mobile backhaul between base stations. Such information could include e.g. the range of spectrum shared or not shared per operator, the information used to coordinate the use of reference signals among operators, and (if applicable) CSI information (reported by the UEs of one operator to their own serving cells, but for the links with the other operator’s cells). Such information exchange also takes some time and consumes the capacity of the backhaul. For example, it is often assumed that signaling over the backhaul between LTE base stations (within the same network) has a typical average delay in the region of 10 ms, and in some extreme cases even up to the range of 20 ms [6]. Somewhat higher latency is suspected for inter-operator information exchange if considering e.g. across-operator authentication.

Another aspect to be considered is the synchronization among base stations of dif- ferent operators. Traditionally synchronization requirements differ between FDD- mode networks and TDD-mode networks: only frequency synchronization has been required for proper operation of FDD-mode mobile networks, while for TDD-mode both frequency synchronization and phase/time synchronization are required. How- ever, in order to support cooperative operation of base stations, new requirements of phase/time synchronization are emerging also for FDD-mode networks, to support certain features such as Multicast Broadcast Multimedia Services (MBSFN) and Coordinated Multi-Point transmission and reception (CoMP). Global Navigation Satellite Systems (GNSS), such as GPS, may be used to achieve synchronizations with required accuracy (e.g. refer to [7] for the required frequency and phase/time accuracy of some technologies). But, GNSS might not always be reliable (e.g. for indoor base stations) and cost-effective. An alternative way is synchronization via network protocols carried by the transport network among base stations. For example, frequency synchronization might be managed via the Ethernet Synchro-

nization Messaging Channel (ESMC) protocol as defined by ITU-T [7]. Phase/time synchronization may be managed at two different layers [8]: physical layer (for the synchronization of timing/phase offset), MAC layer (for the synchronization of data flows). The IEEE 1588 standard [9] is an example of physical-layer phase/time syn- chronization protocol. Synchronization at the MAC layer is desired for cooperative operation of base stations in a decentralized manner. For example, if the data flows are made synchronously available at each base station, joint beamforming can be implemented in a decentralized manner with each base station applying its beam- forming matrices to its own data flow [8]. The authors of [7] proposed an extended version of the ESMC protocol, in order to transport both frequency and phase/time synchronization messages.

In this chapter, we will first give an overall and general analysis of the relation between WP3 transmission schemes and information exchange and synchronization, followed by individual analysis per transmission scheme reported in this deliverable.

2.2 General Analysis

2.2.1 Impact with regard to information exchange over the air interface

In order for UEs to estimate the CSI from base stations of different operators, the reference signals (or pilot patterns) of different operators should be orthogonal to each other, at least for the base stations in the same area. Although in legacy networks (such as LTE) reference signals are not operator-specific, operator-specific reference signals are theoretically possible e.g. by applying an extra operator-specific orthogonal coding to currently cell-specific or UE-specific reference signals. In re- ality, this means that an extra step of applying operator-specific orthogonal coding should be included in the physical-layer (PHY) technical specifications of a radio technology. Another option is that the involved operators make agreement in using the same group of cell-specific reference signals, as long as no collision occurs in using the same reference signals by two cells in the vicinity of each other. This, however, deserves more management burden of the operators.

In legacy networks, a UE mainly reports the estimated CSI from the serving cell. Only when triggered by certain events, e.g. for the purpose of handover, the UE starts to measure the CSI of neighboring cells belong to the same operator (often in the formats of SINR and/or received signal strength) and if further triggered also reports the measured results to the serving cell. While for the developed WP3 transmission schemes, the UE may often need to continuously measure and report the CSI of the other reference cell(s) (of the other operator) at a certain time scale (see next section). This increases the uplink signaling overhead, and perhaps also the downlink signaling overhead in case that acknowledgment messages are required for such reporting. Note also that this signaling overhead scales with the number of antennas.

The overhead in signaling such extra CSI on the air interface is also dependent of the number of cells participating in the cooperation. In the simplest case, only single cell per operator is involved in the cooperation, and the UE only needs to report one extra set of CSI. The drawback is that, since all the other intra- and inter-operator cells in the neighborhood are not taken into account in the determination of trans- mission waveforms, the UE may suffer significant inter-cell interference especially in the case of full frequency reuse (i.e. frequency reuse factor of 1). On the other hand, if more neighbor cells are involved in the cooperation, better performance might be expected at the cost of more extra signaling overhead over the air inter- face. Such theoretical and potential performance gain, however, may be degraded in practice due to the CSI inaccuracy in the sense of e.g. channel estimation error, quantification loss, and latency.

When retransmission is involved, another issue is whether new waveforms (or more precisely pre-coding matrices, in the case of beamforming) should be generated for each retransmission, using updated CSI at the transmitter. There is often tradeoff between processing complexity (for regeneration of new waveforms) and perfor- ▇▇▇▇▇.

The sensitivity of performance to inaccurate CSI at the transmitter determines to large extent the applicability of the corresponding transmission scheme to real radio networks, since in reality the CSI is always inaccurate at the transmitter due to the limits addressed in the former section and possible radio transmission errors.

2.2.2 Impact with regard to information exchange over the backhaul

As indicated by [6], transmission over the mobile backhaul is quite reliable with negligible packet error rate. Thus, for CSI exchanged over the backhaul, we assume no extra inaccuracy is introduced. While for other information exchanged over the backhaul, we assume they are always correctly received.

Taking into account the typical latency (i.e. 10 ms) of transmission over the back- haul, it is expected that the relevant information only changes and will be exchanged at a relatively large time scale (e.g. seconds or higher). It is feasible for almost all information exchanged over the backhaul that we have in mind, e.g. the range of spectrum shared or not shared per operator (it is expected to change at time scale no shorter than hours), and the information used to coordinate the use of refer- ence signals among operators (it is often static). One possible exception asking for special concern is, if applicable, the CSI exchanged over the backhaul. Since these CSI will be used for the (re-)generation of radio waveforms, this implies that the corresponding transmission scheme cannot be too dynamic in the sense of updating radio waveforms.

However, the above limitation is relaxed if the cooperating cells are co-located. In the case that co-siting is not possible, it is also beneficial if the cooperating base

stations are connected with fiber links and only with limited number of routers in between. In the latter case, lower latency (< 1 ms) is expected than the typical value of 10 ms [10]. The latency could be even further reduced, if the concept of centralized RAN (C-RAN) as termed by NGMN [11], or “cloud-based RAN”, is used. C-RAN is a base station architecture that separates the processing part into a centralized cloud based pool of servers, at one side, and the RF part for up-conversion, power amplification and digital (usually optical) to analog radio transmission, on the other side, into a distributed set of remote radio heads (RRHs). This concept can be seen as extreme extension of the current trend for connecting a RRH on transmit tower with optical fiber (up to few tens meters long) to the processing unit in the BS container on the ground for macro- and micro- BSs. Using the concept of C-RAN, cooperating operators may pool their processing units into a centralized fashion, while the radio transmission parts of each operator are connected to the centralized processing units as RRHs. The transmission latency between the processing units and the RRHs is in the order of µs [10].

The required capacity for information exchange depends on the time scale of infor- mation exchange (how often) and the amount of information exchanged (how large). It is hard to estimate quantitatively how much the required capacity will be over the backhaul due to the lack of detailed specification of data formats. Some analysis in literature for cooperative networking, e.g. the analysis in [12] of the signaling traffic over the backhaul for support of LTE-A CoMP, calls for the necessity of deploying optical-fiber for mobile backhauling.

If in the case that the backhaul is not adequate in the terms of latency and/or capacity, an alternative solution could be that the UE reports (at least part of) the CSI to cell(s), other than its own serving cell, where the CSI will be used. However, this is not supported by legacy networks so far. One challenge is that the UE needs to set up multiple radio links with different cells, which, although not impossible theoretically, increases hardware complexity and reduces battery life of the UE significantly.

2.2.3 Impact with regard to synchronization among base stations

For synchronization (frequency and phase/time) among base stations of different operators, it is crucial that the same synchronization mechanism is adopted by the involved operators. It is especially important for synchronization with indoor base stations, since there is no GPS equipped with the latter. That’s might be managed via internationally standardized network synchronization protocols like IEEE 1588 and ESMC.

For transmission schemes for which the transmission waveforms are generated in a decentralized manner, the data flows of cooperative base stations is desired to be synchronized, e.g. using one of the approaches proposed in [7]. On the other hand, for centralized approaches, the data flows are naturally synchronous.

It is for further study what the requirements of the WP3 transmission schemes are on the synchronization accuracy.

2.3 Individual Analysis per Transmission Scheme

In this section, we analyze the impact with regard to information exchange and synchronization for three of the transmission schemes reported in this deliverable.

2.3.1 Distributed MIMO signal processing and resource allocation

This transmission scheme targets at downlink transmission. In order for the coop- erative base stations (of different operators) to jointly determine their beamforming precoding matrices, the base stations should have the knowledge of the CSI of each pair of transmitters and receivers (either desired or undesired) involved. Each UE needs to measure the reference signals of both its own serving cell and other ref- erence cells. For this purpose, the reference signals of coordinated cells should be orthogonal to each other.

As in legacy networks, we expect that each UE reports their estimated downlink CSI only to its serving cell. Thus, the CSI should be further exchanged via the backhaul, in order to be jointly used for the determination of best precoding matrices. The way how information is exchanged over the backhaul is largely dependent of whether the scheme is implemented in a centralized manner or distributed manner:

•

In a centralized manner, all the CSI collected by the cooperative base stations should be further forwarded to the network node where the processing is taken and the decision is made. That can be a separate network node or one of the base stations participating in the coordination. After making decision, the decided precoding matrices should be fed back to all the involved base stations.

•

In a distributed manner, the cooperative base stations should exchange CSI they have collected separately. The processing is taken and the decision is made at each of the base stations.

Some basic analysis has been executed with regard to the performance impact of partial CSI, see Section 4.1. The analysis shows that in some cases this partial CSI may introduce up to 5 dB performance loss (in the sense of required SNR for the same throughput), if compared with full CSI. This may imply the importance of full CSI to this transmission scheme. However, we admit that the results might have been impacted by the way how we have modeled the partial CSI (as channel correlation matrices).

In the case that the number of users sharing the same radio resource in one single cell is higher than the number of antennas, another extra parameter needs to be

determined is the transmit power per user. In a centralized manner, this output should also be fed back to the base stations.

This scheme requires synchronization between different operators’ base stations. The cooperative base stations should be synchronized to each other at the physical layer, and in the case of a distributed manner also at the MAC layer.

2.3.2 Sharing with instantaneous relaying

In this transmission scheme (see Section 5.1), the decision is made by the relay station (RS) in a centralized manner. For this purpose, for each pair of source nodes (base stations) and destination nodes (terminals) the relay station should have the CSI of three different links: the BS-to-RS link, the RS-to-UE link, and the direct BS-to-UE link. The relay station is able to estimate the CSI of the first link (downlink) or the second link (uplink), while for the other two links the CSI is either estimated by the UE (downlink) or the base station (uplink).

In order to estimate the CSI of the BS-to-RS link (downlink) and RS-to-UE link (uplink), the relay station needs to decode the received signals, and detect the position of reference signals/pilots in the data flow. The CSI estimated by the base stations and UEs could in principle be signaled to the relay node via the air interfaces, with the costs of signaling overhead and some inaccuracy. The relay station should also be able to read these signaled message from the received signals. The delay introduced by such data processing should also be taken into account in practice.

In the analysis we have made so far, we have assumed that two reference cells of different operators are involved in the coordination. In principle, it might be ex- tended to the cases of more cells. The signaling overhead and processing complexity both increase linearly with the number of cooperating cells. This, however, may not always bring obvious performance gain if the additional cells are located relatively far away from the relay station.

This scheme requires synchronization between different operators’ base stations and the relay station. The cooperative base stations and the relay station should be synchronized to each other. The synchronization between the relay station and base stations often has to rely on the radio link between the relay station and the base stations, if the relay station is not connected with backhaul or equipped with GPS (that is usually the case). In principle this could be done in a way similar to how terminals synchronize to their serving cell in radio networks: by listening to the synchronization signals in the downlink, and by emitting random access preambles/attempts in the uplink.

2.3.3 Sharing with two-way amplify-and-forward relaying

In this transmission scheme, by default, only two reference cells are involved in the coordination for sharing spectrum and the relay station. Other neighboring cells don’t join the coordination. In this case, it is assumed that the relay station esti- mates the effective CSI (in the format of channel response) between each terminal and its serving cell, and accordingly applies “amplification matrices” before it for- wards the received signals. No CSI needs to be exchanged over the air interface nor via the backhaul link between base stations.

In order for the relay station to estimate the channel response of the links to the cells and to the terminals, operator-specific reference signals should be introduced, both in downlink and uplink. In each direction, reference signals of different op- erators should be orthogonal to each other, in order for the receiver to distinguish signals from different sources and accordingly estimate the corresponding channel responses.

Some basic analysis has been executed with regard to the performance sensitivity of this scheme to the inaccuracy of channel responses estimated by the relay station, see Section 5.4. This basic analysis shows little impact of channel estimation error to the performance in the high SNR regime. However, we admit that since some ideal assumption (e.g. pilot pattern) is assumed in the analysis, more thorough analysis is required in the further study in order to have a more convincing judgment, especially taking practical limitations into account.

The information exchanged over the backhaul might include the information used to coordinate the use of reference signals among operators, the range of spectrum shared, etc. As discussed in the former section, such information is often either static or vary at large time scale.

If more cells join the cooperation and share the same relay station, the signaling overhead increases linearly with the number of cooperating cells. This may not always bring obvious performance gain if the base stations which control these additional cells are located relatively far away from the relay station.

If taking neighboring cells (other than two reference cells) of the same operators into account in the cooperation, while these additional cells don’t use the relay station,

or the direct link between cells and terminals is enabled, there will be significant change in the way of operation and the required information exchange:

•

First, the precoding matrices at the base stations should also be dynamically adapted, for which knowledge of additional CSI is required at the cells/base station. The amplification matrix design at the relay station may still follow the way of the default case (only the two reference cells are involved in the cooperation).

•

The base stations would need to have the CSI of the BS-to-RS link as well as the RS-to-UE link, instead of the above-mentioned effective CSI between the base station and the user terminal. For this purpose, the relay station needs to forward the estimated channel response on the RS-to-UE link to the base stations, consuming some radio resource.

• Base stations need to exchange CSI via the backhaul.

3 Interference Mitigation in Spectrum Sharing Networks

3.1 Partial Data Sharing in Multicell Spectrum Sharing Networks

In this section we consider the joint precoding across K distant transmitters (TXs) towards K single-antenna receivers (RXs) sharing the same spectrum. Joint pre- coding across distant TXs requires sharing of users data symbols and CSI across the cooperating TXs. Sharing of data symbols and CSI implies high backhaul and feedback overhead which increases as the number of cooperating terminals in the network increases. However, in practical networks, cooperation between TXs is lim- ited by the constraints on the backhaul network. With the assumption of perfect CSI, we address the following issues: How to design a joint precoder for a given rout- ing pattern? How to optimize the data symbol allocation subject to a constraint on the total number of symbols being routed?

3.1.1 System model

The considered network has K TXs, with TX k equipped with Nk antennas and K

single-antenna RXs. The total number of antennas at the transmit side is denoted

by NT =

ΣK

k=1

Nk. The k-th RX receives

yk = hHx + ηk (3.1)

where hH ∈ C1×NT represents the channel vector corresponding to the k-th RX,

∈

x CNT ×1 represents the combined transmit signals of all users sent by all the trans-

∼

mit antennas and ηk CN(0, σ2) represents the i.i.d. complex circular-symmetric additive Gaussian noise at the k-th RX. The whole multiuser channel matrix of the system is

H = [h1, h2, . . . , hK]H . (3.2)

{ }

The entries of the channel matrix H read as H ij = γijGij where Gij is i.i.d. CN(0, 1) to model the Rayleigh fading and γij is a positive real number modeling the long term attenuation. It is assumed that all the TXs have the knowledge of the CSI of the whole multiuser channel. Multi-transmitter cooperative processing in the form of joint linear precoding is adopted. Thus, the transmitted signal x is

obtained from

 s1 

x = Ts = [t1

· · · tK

] .





(3.3)

∈

where s CK×1 is the vector of transmit symbols (whose entries are independent

∈

CN(0, 1)), T is the precoding matrix and tk CNT ×1 is the beamforming vector of

the symbol k.

Noting that in a statistically symmetric isotropic network, fulfilling a sum power constraint will lead to an equal average power used per TX, we consider for simplicity

a sum power constraint ǁTǁ2

= K P . We also assume that all data streams are

allocated with an equal amount of power so that ∀k, ǁtkǁ

= P .

Conventional zero-forcing (ZF) schemes result in a complete removal of interference at the receivers. This is optimal at high SNR but not necessarily at intermediate SNR. To further improve the performance of the precoder, regularized ZF precoder, which achieves good performance even at intermediate SNR is used and is given by

√

▇▇ ▇ . H Σ−1

T = H HH + αI (3.4)

HH (HHH + αI)−1

where α = K × (σ2/P ) is the regularization constant. The sum-rate of the system is equal to

K K

R = Σ Rk = Σ log2 (1 + SINRk) . (3.5)

k=1

where the SINR of the k-th data stream is given by

H 2

SINRk

= |hk tk|

. (3.6)

σ2 +

j=Σ1,j=ƒ

|hHtj|2

▇

Furthermore, the multiplexing gain (MG) of the system is defined as

MG ¾ MGk k=1

Σ R (P )

¾ lim k . (3.7)

k=1 P →∞ log2(P )

Backhaul Data Symbol Routing

To represent the effect of the allocation of the users’ data symbols to the TXs, we introduce the concept of routing matrix which specifies to which TXs the symbol of a given user is being routed. In realistic settings, e.g. cellular networks, we

∈

are concerned with the sharing of the users’ data symbols to the TXs and not to each antenna individually, as the different antennas at a given TX are collocated and perfectly cooperate. To model the data symbol sharing, routing matrix D

K×K

{0, 1} is defined as the matrix whose element {D}i,j ∈ {0, 1} is 1 if symbol sj

is allocated to TX i and 0 otherwise. The number of users’ data symbols shared

∈

ǁ ǁ

in the backhaul network can be seen to be equal to d = D 2 . Yet, the antenna configuration has not been taken into account in the routing matrix. Therefore, more notations are needed to represent it. Thus, the expansion matrix E CNT ×K

is defined as:

E ¾ ΣAT · · · AT ΣT (3.8)

▇

where the matrix Ak ∈ CNk ×K is defined as Ak ¾ 1[N ×1]eT. The vector ek ∈ CK×1

is the k-th vector from the canonical basis. Now the matrix D˜ ¾ ED can be defined

and represents well the data sharing constraints, as will be shown in Section 3.1.2.

Optimization Problem

In order to optimize the backhaul routing directly, the following sum-rate maxi- mization problem is formulated:

maximize

D∈{0,1}K×▇

▇

k=1

∗

(3.9)

subject to d = ǁDǁ2

≤ d .

d∗

. Σ

d∗

. Σ

where d∗ is the constraint on the backhaul routing overhead. Above problem is a discrete combinatorial optimization problem and generally exhaustive search is required to find the optimal data symbol allocation. For exhaustive search, a total of K2 data allocation combinations need to be searched over, i.e the complexity grows as K2 . This is prohibitive even for small number of cooperating nodes. Therefore, a greedy algorithm having low complexity is proposed.

3.1.2 Precoder optimization

In this section we consider the design of the precoder T given the routing matrix

D. Note that each beamforming vector can be derived independently due to the ZF constraints. If one TX does not receive one symbol, it cannot participate into the transmission of that symbol and the coefficient used for that beamformer at that

TX is then 0. Therefore the precoder with constrained backhaul overhead will be of the form T ⊙ D˜ , where ⊙ is the element wise product. The beamforming vector tk transmitting the symbol k is obtained from the following optimization:

minimize H tk

⊙ D˜

:,k

Σ − φ:,k 2

(3.10)

where φ = HT and the precoder T is obtained using (3.4) with full routing.

In optimization (3.9) we consider no predefined constraint on the routing pattern’s structure. Therefore, there may be some columns of the routing matrix containing only zero i.e. a user is not served. This is some kind of user selection and accounts for a positive MG with partial data sharing. Thus, a non-active user is defined as a user whose data symbol is not routed to any TX, i.e., D:,k = 0K×1, or the power allocated to the user pk = 0. If there are non-active users in the system, the precoding scheme needs to be modified so that interference is removed only at active users. We start by introducing some notations and we denote the set of

indices such that tk ⊙ D˜ :,k 0 by J and the reduced vector tk (J ) ∈ Cn2×1 with

| | ∈

| |

n2 = J made of the elements of J . Further more, the set of indices corresponding to the active-users is denoted by A, with n1 = A . The matrix H (A, J ) Cn1×n2 is used to represent the channel containing only the rows and columns consisting in A and J respectively. Finally, φ˜ = φ (A, A) represents a sub matrix of φ formed

by keeping the rows and columns of the active-users. The beamforming vector tk

can now be obtained from the following optimization:

▇

2 (3.11)

˜ minimize H (A, J ) t (J ) − φ

tk(J )

which can be solved as a conventional Least Square problem. The beamforming vector of user k is then obtained by reinserting the coefficients obtained in tk at the positions corresponding to J in tk.

3.1.3 Multiplexing gain analysis

In this section we consider the fundamental limit behind the optimization problem (3.9) and look for the maximum achievable MG when there is a constraint on the number of data symbols allocated. In the following analysis we assume that there will always be enough users in the system so that the MG is not restricted by the total degrees of freedom available at the RXs. Mathematically, if γ is the maximum MG, then K ≥ γ.

Theorem 3.1. In order to achieve the maximum MG γ under the constraint

2 ∗

ǁDǁF ≤ d , the following conditions needs to be satisfied

KTX

Nk ≥ γ ( ZF Feasibility ) ,

k=1

γ × KTX ≤ d∗ ( Sharing Constraint ) .

(3.12)

Proof. Proof can be found in [13].

Corollary 3.1. The maximum MG γ achieved with all TXs having N antennas

under the constraint ǁDǁ2

≤ d is given by

∗

 ,√ ,  d∗



γ = max N



d∗/N

,√d∗/N ,

. (3.13)

Proof. The proof can be found in [13].

3.1.4 Greedy data symbol allocation Greedy Algorithms

We consider the decreasing greedy algorithm (DEC), in which the routing matrix is initialized with D = 1K×K and at each iteration the element of D which causes the least degradation in the sum-rate is set to 0 (i.e. removing the routing link of one symbol to one of the TX) . This process is continued till the constraint on the number of backhaul links is reached.

Algorithm 1 Decreasing greedy algorithm

Input: H, d∗, Output: D, p Require: D = 1N×K, p = P × 1K×1 1: for d = 1 to K2 − d∗ do

′

2: Cinit = 0, Dtemp = D, p = p

3: for RX k = 1 to K do

4: for TX l = 1 to K do

5: if {Dtemp}lk =ƒ 0 then

6: {Dtemp}lk = 0

7: T = precoding (Dtemp, H, p) %(From Sec.3.1.2)

8: Csum= sumrate(T, H) %using (3.5)

9: if Csum ≥ Cinit then

10: m = l, n = k, Cinit = Csum

11: end if

12: Dtemp = D

13: end if

14: end for

15: end for

16: {D}mn = 0, T = precoding (D, H, p)

17: Csum= sumrate(T, H) %using (3.5)

′

18: p =Power allocation(p, T, Csum) %(C.f 3.1.4)

19: end for

The greedy algorithm at any given step try to maximize the performance at each step and it does not necessarily achieve the maximum MG. By not being MG

optimal does not hurt the performance at low SNR but in intermediate and high SNR it has a considerable effect on the performance of the algorithm. Therefore, some improvements to the greedy algorithm is proposed based on the analysis done in Section 3.1.3.

Binary Power Control In Binary Power Control (BPC) power allocated to k-th user pk takes only two values 0 or P [14]. In this work we use the idea of BPC in Algorithm 1 in step 18 to make it MG optimal. After each step of removing a data symbol from D we check whether by turning off a user completely, results in an increase in the sum-rate. If the sum-rate increases by turning off k-th user completely, then the power allocated to that user pk = 0.

Proposition 3.1. With single antenna TXs, the DEC with BPC as described in Algorithm 1 achieves the optimal MG.

Proof. The proof can be found in [13].

3.1.5 Simulations

We investigate the performance of a network consisting of K = 7 cells. Each cell consists of one TX located at the center of the cell. In the simulations, an average cell edge SNR of 20 dB is maintained by selecting the proper transmit power and the cell radius. The simulation results are averaged over uniform randomly generated RX positions (such that each cell has exactly one RX) and Rayleigh fading realizations. In Figure 3.1 the performance of the greedy algorithm with multiple antennas at the TXs with [N1, N2, ..., N7] =[2 1 1 3 1 2 1] is considered.

The greedy algorithm outperforms the conventional time-sharing scheme when the data symbol sharing overhead is more than 10% of the full cooperation scenario. We define the SAPHYRE gain to be the ratio of sum rate performances between the pro- posed algorithm and a time-sharing scheme without any cooperation among trans- mitters. For illustration purposes, we can see from Figure 3.1 that the SAPHYRE gain is 2.4 when the backhaul overhead is 50% of the full cooperation scenario. As the allowed backhaul overhead increases the SAPHYRE gain also increases.

3.1.6 Conclusion

The performance of joint precoding across cooperating TXs under varying back- haul data symbol routing overhead is studied. Simulation results shows that the proposed routing algorithm out performs the the conventional time-sharing schemes for practical backhaul overhead values.

SAPHYRE

Gain

Decreasing Greedy

Time−sharing (No cooperation)

Average Rate per User [bits/s/Hz]

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Fraction of Full Data Symbol Sharing: d*/K2

Figure 3.1: Sum-rate in terms of fraction of full data symbol sharing for a average cell edge SNR of 20dB and the antennas corresponding to the TXs are distributed as [2 1 1 3 1 2 1].

3.2 A Coordinated Approach to Channel Estimation

In this section, we address the problem of channel estimation in the presence of multi-operator interference generated by spectrum sharing, and more specifically pilot contamination. We propose an estimation method which provides a substan- tial improvement in performance. It relies on two key ideas. The first is the ex- ploitation of dormant side-information lying in the second-order statistics of the user channels, both for desired and interfering users. In particular, we demonstrate a powerful result indicating that the exploitation of covariance information under certain subspace conditions on the covariance matrices can lead to a complete re- moval of pilot contamination effects in the large number of antennas limit. We then turn to a practical algorithm design where this concept is exploited. The key idea behind the new algorithm is the use of a covariance-aware pilot assignment strategy among spectrum sharing operators within the channel estimation phase itself.

More specifically, our contributions are the following: We first develop a Bayesian channel estimation method making explicit use of covariance information in the inter-operator interference scenario with pilot contamination. We show that the channel estimation performance is a function of the degree to which dominant signal subspaces pertaining to the desired and interference channel covariance overlap with each other. Therefore we exploit the fact that the desired user signals and interfering

user signals are received at the operator side with (at least approximately) finite- rank covariance matrices. This is typically the case in realistic scenarios due to the limited angle spread followed by incoming paths originating from street-level users [15]. Finally, we propose a pilot sequence assignment strategy based on carefully assigning selected groups of users to identical pilot sequences.

The gains are shown to depend on system parameters such as the typical angle spread measured at the service provider (SP) and the number of SP antennas. Performance close to the interference-free channel estimation scenario is obtained for moderate numbers of antennas and users.

3.2.1 Signal and channel models

−

We consider a network of L time-synchronized1 spectrum sharing operators (for ease of exposition one can set L=2), with full spectrum reuse. Estimation of (block- fading) channels in the uplink is considered,2 and all the SPs are equipped with M antennas each. To simplify the notations, we assume the 1st SP is the target SP, unless otherwise notified. We assume the pilots, of length τ , used by single- antenna users in the same SP area are mutually orthogonal. As a result, intra-cell interference is negligible in the channel estimation phase. However, non-orthogonal (possibly identical) pilots are reused from SP area to SP area, resulting in pilot contamination from L 1 interfering SPs. For ease of exposition, we consider the case where a single user per operator transmits its pilot sequence to its serving SP. The pilot sequence used in the l-th SP is denoted by:

| | · · · | |

sl = [ sl1 sl2 · · · slτ ]T . (3.14) The powers of pilot sequences are assumed equal such that sl1 2 + + slτ 2 =

τ, l = 1, 2, . . . , L.

The channel vector between the l-th SP area’s user and the target SP is hl. Thus, h1 is the desired channel while hl, l > 1 are interference channels. All channel vectors are assumed to be M 1 complex Gaussian, undergoing correlation due to the finite multipath angle spread at the SP side [16]:

hl = R1/2hW , l = 1, 2, . . . , L, (3.15)

l l

where hWl ∼ CN(0, IM ) is the spatially white M × 1 SIMO channel, and CN(0, IM ) denotes zero-mean complex Gaussian distribution with covariance matrix IM . In

1Note that assuming synchronization between uplink pilots provides a worst case scenario from a pilot contamination point of view, since any lack of synchronization will tend to statistically decorrelate the pilots.

2Similar ideas would be applicable for downlink channel estimation, provided the UT is equipped

with multiple antennas as well, in which case the estimation would help resolve interferences originating from neighboring SPs.

this scenario, we make the assumption that the covariance matrix Rl ¾ E{hlhH}

can be obtained separately from the desired and interference channels.

During the pilot phase, the M × τ signal received at the target SP is

ΣY = h s + N, (3.16)l

l=1

∈

where N CM×τ is the spatially and temporally white additive Gaussian noise (AWGN) with zero-mean and element-wise variance σ2.

3.2.2 Covariance-based channel estimation Pilot Contamination due to Spectrum Sharing

Conventional channel estimation relies on correlating the received signal with the known pilot sequence (referred here as Least Squares (LS) estimate for example). Hence, using the model in (3.16), a LS estimator for the desired channel h1 is

h^LS = Ys1

∗(s1T

s1∗

−1

) . (3.17)

· · ·

The conventional estimator suffers from a lack of orthogonality between the desired and interfering pilots, an effect known as pilot contamination [17], [18], [19]. In particular, when the same pilot sequence is reused in all L SPs, i.e., s1 = = sL = s, the estimator can be written as

h^LS = h1 +

hl + Ns∗/τ . (3.18)

l 1

As it appears in (3.18), the interfering channels leak directly into the desired chan- nel estimate. The estimation performance is then limited by the signal to interfer- ence ratio at the SP side, which in turns limits the ability to design an effective interference-avoiding beamforming solution.

Bayesian Estimation

We hereby propose an improved channel estimator with the aim of reducing the pi- lot contamination effect, and taking advantage of the multiple antenna dimensions. We suggest to do so by exploiting side information lying in the second order statis- tics of the channel vectors. The role of covariance matrices is to capture structural information related to the distribution (mainly mean and spread) of the multipath angles of arrival at the SP. Due to the typically elevated position of the operator, rays impinge on the antennas with a finite angle-of-arrival (AOA) spread and a user

location-dependent mean angle. Note that covariance-aided channel estimation it- self is not a novel idea, e.g., in [20]. In [21], the authors focus on optimal design of pilot sequences and they exploit the covariance matrices of desired channels and colored interference. The optimal training sequences were developed with adapta- tion to the statistics of disturbance. In our scenario, however, the pilot design is shown not having an impact on interference reduction, since fully aligned pilots are transmitted. Instead, we focus on i) studying the limiting behavior of covariance- based estimates in the presence of interference and large-scale antenna arrays, and

ii) how to shape covariance information for the full benefit of channel estimation quality.

Two Bayesian channel estimators can be formed. In the first, all channels are estimated at the target SP (including interfering ones). In the second, only h1 is estimated.

By vectorizing the received signal and noise, our model (3.16) can be represented as

y = S˜h + n, (3.19)

∈

where y = vec(Y), n = vec(N), and h CLM×1 is obtained by stacking all L

channels into a vector.

The pilot matrix S˜ is defined as

S˜ ¾ Σ s1 ⊗ IM · · · sL ⊗ IM Σ . (3.20)

By applying Bayes’ rule and by using the maximum a posteriori (MAP) decision rule, one can obtain the following expression for the Bayesian estimator [22]:

h^ = (σ2ILM + RS˜H S˜)−1RS˜H y. (3.21)

Interestingly, the Bayesian estimate as shown in (3.21) coincides with the minimum mean square error (MMSE) estimate, which has the form

^h = RS (SRS + σ I ) y. (3.22)τM

MMSE ˜H ˜ ˜H 2 −1

(3.21) and (3.22) are equivalent thanks to the matrix inversion identity (I + AB)−1A = A(I + BA)−1.

Channel Estimation with Full Pilot Reuse

Matched filters require the knowledge of the desired channel only, so that interfer- ence channels can be considered as nuisance parameters. For this case, the single user channel estimation shown below can be used. For ease of exposition, the worst case situation with a unique pilot sequence reused in all L cells is considered:

s = [ s1 s2 · · · sτ ]T . (3.23)

⊗

Similar to (3.20), we define a training matrix S¯ ¾ s IM . Note that S¯H S¯ = τ IM . Then the vectorized received training signal at the target operator can be expressed as

y = S¯ hl + n. (3.24)

l=1

Since the Bayesian estimator and the MMSE estimator are identical, we omit the derivation and simply give the expression of this estimator for the desired channel h1 only:

h^1 = R1S¯H

.S¯

.Σ

l=1

RlΣ

S¯H + σ2IτM

Σ−1

y (3.25)

= R1

.σ2IM + τ

Σl=1

−1

Rl S¯H y. (3.26)

Note that the MMSE channel estimation in the presence of identical pilots is also undertaken in other works such as [23].

^E{ǁh − h ǁ }.1 1

Focusing on the mean squared error (MSE) of the proposed estimators, this can be defined as: M ¾ E{ǁh^ − hǁ2 }, or for the single user channel estimate M1 ¾

The estimation MSE of (3.21) is

M = tr .R

.ILM +

S˜H S˜

Σ−1Σ

R

. (3.27)

Specifically, when identical pilots are used within all SP areas, the MSEs are

M = tr

.R.

ILM +

τ JLL IM

⊗

Σ−1Σ

R

, (3.28)

M1 = tr



R1 − R2

.σ

n IM +

Σl=1

−1



Σ 

Rl , (3.29)



where JLL is an L × L unit matrix consisting of all 1s.

We can readily obtain the channel estimate of (3.26) in an interference-free scenario, by setting interference terms to zero:

h^no int = R .σ2 I

+ τ R Σ−1S¯H (S¯h

+ n), (3.30)

where the superscript no int refers to the "no interference case", and the correspond- ing MSE:

Mno int = tr

.R1

.IM +

2 R1

Σ−1Σ

. (3.31)

Large Scale Analysis

We seek to analyze the performance for the above estimators in the regime of large antenna number M . For tractability, our analysis is based on the assumption of uniform linear array (ULA) with supercritical antenna spacing (i.e., less than or equal to half wavelength).

Hence we have the following multipath model3

Σ 1

hi = √

p=1

a(θip)αip, (3.32)

where P is the arbitrary number of i.i.d. paths, αip ∼ CN(0, δ2) is independent over channel index i and path index p, where δi is the i-th channel’s average attenuation. a(θ) is the steering vector, as shown in [24]

a(θ) ¾





e−j2π D

cos(θ)





, (3.33)

e−j2π (M−1)D cos(θ)



where D is the antenna spacing at the SP and λ is the signal wavelength, such that D ≤ λ/2. θip ∈ [0, π] is a random AOA. Note that we can limit angles to [0, π] because any θ ∈ [−π, 0] can be replaced by −θ giving the same steering vector.

Below, we momentarily assume that the selected users exhibit multipath AOAs that do not overlap with the AOAs for the desired user, i.e., the AOA spread and user locations are such that multipath for the desired user are confined to a region of space where interfering paths are very unlikely to exist. Our main result is as follows:

Theorem 3.2. Assume the multipath angle of arrival θ yielding channel hj, j = 1, . . . , L, in (3.32), is distributed according to an arbitrary density pj(θ) with

bounded support, i.e., pj(θ) = 0 for θ ∈/

[θmin, θmax] for some fixed θmin

™ θmax ∈

3Note that the Gaussian model (3.15) can well approximate the multipath model (3.32) as long as there are enough paths. Since the number of elementary paths is typically very large, we have P ≫ 1 this assumption is valid in practice.

[0, π] . If the L − 1 intervals [θmin, θmax] , i = 2, . . . , L are strictly non-overlapping

with the desired channel’s AOA interval4 [θmin, θmax], we have

1 1

lim

M →∞

h^1 = h^no int. (3.34)

Proof. From the channel model (3.32), we get

δ2

R = i

Σ E{a(θ

p=1

)a(θ

)H} = δ2E{a(θ )a(θ )H},

where θi has the PDF pi(θ) for all i = 1, . . . , L. The proof of Theorem 3.2 relies on three intermediate lemmas which exploit the eigenstructures of the covariance matrices. The proofs of the lemmas are provided in [22]. The essential ingredient is to exhibit an asymptotic (at large M ) orthonormal vector basis for Ri constructed from steering vectors at regularly sampled spatial frequencies.

Lemma 3.1. Define α(x) ¾ [ 1 e−jπx · · · e−jπ(M−1)x ]T and A ¾ span{α(x), x ∈

[−1, 1]}. Given b1, b2 ∈ [−1, 1] and b1 < b2, define B ¾ span{α(x), x ∈ [b1, b2]},

then

• dim{A} = M

• dim{B} ∼ (b2 − b1)M /2 when M grows large. Proof. The proof is given in [22].

Lemma 3.1 characterizes the number of dimensions a linear space has, which is spanned by α(x), in which x plays the role of spatial frequency.

Lemma 3.2. With a bounded support of AOAs, the rank of channel covariance matrix Ri satisfies:

rank(Ri) ™ d , as M → ∞,

M i

where di is defined as

d ¾ .cos(θmin) − cos(θmax)Σ .

Proof. The proof is given in [22].

−

Lemma 3.2 indicates that for large M , there exists a null space null(Ri) of dimension (1 di)M . Interestingly, related eigenstructure properties of the covariance matrices were independently derived in [25] for the purpose of reducing the overhead of downlink channel estimation and CSI feedback in massive MIMO for FDD systems.

4This condition is just one example of practical scenario leading to non-overlapping signal sub- spaces between the desired and the interference covariances, however, more general multipath scenarios could be used.

Lemma 3.3. The null space null(Ri) includes a certain set of unit-norm vectors:

null(Ri) ⊃ span

√M , ∀Φ ∈/ [θi , θi ]

, as M → ∞.

.a (Φ)

min

max Σ

Proof. The proof is given in [22].

This lemma indicates that multipath components with AOA outside the AOA region for a given user will tend to fall in the null space of its covariance matrix in the large-number-of-antennas case.

3.2.3 Coordinated pilot assignment

{ } ∈

In this section, we exploit the aforementioned results in order to design a suitable coordination protocol for assigning pilot sequences to users in the L SPs. The role of the coordination is to optimize the use of covariance matrices in an effort to try and satisfy the non-overlapping AOA constraint of Theorem 3.2. We assume that in all L SPs, the considered pilot sequence will be assigned to one (out of K) user in each of the L SPs. Let G ¾ 1, . . . , K , then Kl G denotes the index of the user in the l-th SP area who is assigned the pilot sequence s. The set of selected users is denoted by U in what follows.

We use the estimation MSE in (3.29) as a performance metric to be minimized in order to find the best user set. (3.28) is an alternative if we take the estimates of interfering channels into consideration. For a given user set U, we define a network utility function

Σ M U

|U|

F(U) ¾ j( ) , (3.35)

j=1 tr {Rjj(U)}

| |

where U is the cardinal number of the set U. Mj(U) is the estimation MSE for the desired channel at the j-th SP, with a notation readily extended from M1 in (3.29), where this time spectrum sharing operator j is the target one when computing Mj. Rjj(U) is the covariance matrix of the desired channel at the j-th spectrum sharing operator.

The principle of the coordinated pilot assignment consists in exploiting covariance information at all SP areas (a total of KL2 covariance matrices) in order to minimize the sum MSE metric. In view of minimizing the search complexity, a classical greedy approach is proposed:

1) Initialize U = ∅

2) For l = 1, . . . , L do:

∪ { }

Kl = arg min F(U k )

k∈G

U ← U ∪ {Kl}

3) End

The coordination can be interpreted as follows: To minimize the estimation error, a SP tends to assign a given pilot to the user whose spatial feature has most differences with the interfering users assigned the same pilot. Clearly, the performance will improve with the number of users, as it becomes more likely to find users with discriminable second-order statistics.

3.2.4 Numerical results and conclusions

In order to preserve fairness between users and avoid having high-SNR users being systematically assigned the considered pilot, we consider a symmetric multi-operator network where the users are all distributed on the SP area edge and have the same distance from their SPs. In practice, users with greater average SNR levels (but equal across operator areas) can be assigned together on a separate pilot pool. Some basic simulation parameters are given in Table 3.1. We keep these parameters in the following simulations, unless otherwise stated.

Table 3.1: Basic simulation parameters

Operator area radius	1 km
Operator area edge SNR	20 dB
Number of users per operator area	10
Distance from a user to its operator	800 m
Path loss exponent	3
Carrier frequency	2 GHz
Antenna spacing	λ/2
Number of paths	50
Pilot length	10

The channel vector between the u-th user in the l-th SP area and the target SP is

Σ 1

hlu = √

p=1

a(θlup)αlup, (3.36)

∀

where θlup and αlup are the AOA and the attenuation of the p-th path between the u-th user in the l-th SP area and the target SP respectively. Note that the variance of αlup, p is δ2 , which includes the distance-based path loss βlu between the user and the target operator (which can be anyone of the L operators):

βlu = γ , (3.37)

where α is a constant dependent on the prescribed average SNR at SP area edge.

dlu is the geographical distance. γ is the path-loss exponent.

^of the j-th SP is w = h .jj

A bounded (uniform) AOA distribution is considered in this case. Moreover, two performance metrics are used to evaluate the proposed channel estimation scheme. The first one is a normalized channel estimation error whereas the second perfor- ▇▇▇▇▇ metric is the per-operator rate of the downlink obtained assuming standard MRC beamformer based on the channel estimates. The beamforming weight vector

MRC

Numerical results of the proposed channel estimation scheme are now shown. In the figures, "LS" stands for conventional LS channel estimation. "CB" denotes the Covariance-aided Bayesian estimation (without coordinated pilot assignment), and "CPA" is the proposed Coordinated Pilot Assignment-based Bayesian estimation.

−10

Conventional LS Estimation

Covariance−aided ▇▇▇▇▇▇▇▇ (CB) Estimation LS − Interference Free Scenario

CB − Interference Free Scenario

−15

Estimation Error [dB]

−20

−25

−30

−35

−40

0 20 40 60 80 100 120 140 160 180 200

Antenna Number

Figure 3.2: Estimation MSE vs. SP antenna number, 2-SP network, fixed positions of two users, uniformly distributed AOAs with θ∆ = 20 degrees, non- overlapping multipath.

In Figure 3.2 we validate Theorem 3.2 with a 2-cell network, where the two users’ positions are fixed. AOAs of desired channels are uniformly distributed with a mean of 90 degrees, and the angle spreads of all channels are 20 degrees, yielding no overlap between desired and interfering multipaths. The pilot contamination is quickly eliminated as the number of antennas at the SP increases.

In Figure 3.3 the per-SP sum rate is depicted as a function of the number of anten- nas at the SP, regarding a two-operator network and for two different cases, CPA (sharing case) and the Bayesian estimator (non-sharing case). Here, the pilot se- quences are statistically correlated and the AOAs are uniformly distributed. It is evident that the sharing gain (SAPHYRE gain) increases when the number of the

Figure 3.3: Per-operator sum-rate vs. antenna number, 2-operator network, sta- tistically correlated pilot sequences, uniformly distributed AOAs with θ∆ = 10 degrees - Bayesian estimator.

antennas at the SP gets larger.

As a conclusion, this work proposes a covariance-aided channel estimation frame- work in the context of interference-limited, multiple spectrum sharing operator, multiple antenna systems. In our scenario, we assumed the individual covariance matrices can be estimated separately. This could be done in practice by exploiting resource blocks where the desired user and interference users are known to be as- signed at different times. In future networks, one may imagine a specific training design for learning second-order statistics.

3.3 Coordinated Scheduling and Beamforming for Spectrum Sharing Networks

In this section, we consider the downlink of a multicell network where neighboring multi-antenna base stations share the spectrum and coordinate their frequency and spatial resource allocation strategies to improve the overall network performance. The objective of the coordination is to maximize the number of users that can be scheduled, meeting their quality-of-service requirements with the minimum total

transmit power. The coordinated scheduling and multiuser transmit beamforming problem is combinatorial; we formulate it as a mixed-integer second-order cone pro- gram and propose a branch & bound algorithm that yields the optimal solution with relatively low-complexity. The algorithm can be used to motivate or benchmark ap- proximation methods and to numerically evaluate the gains due to spectrum sharing and coordination.

3.3.1 System model

We consider a wireless network with L cells, where in each cell there is one base station (BS) and K mobile stations (MSs). The BS of the lth (l ∈ L ¾ {1, · · · , L}) cell is denoted as BSl and the kth (k ∈ K ¾ {1, · · · , K}) MS in the lth cell is denoted as MSl,k. Each BS has Nt antennas and each MS has a single antenna.

∈ { · · · }

Multiuser downlink beamforming is employed at each BS. The number of available subchannels is N (indexed by n N ¾ 1, , N ), which is assumed to be smaller than the total number of MSs in the network, i.e., N < LK. We assume that each MS can be scheduled in at most one subchannel. All the channel state information, i.e., from each BS to every MS in the network, is assumed to be known, and the channels are flat in each transmission interval.

l,k

We denote the beamforming vector for MSl,k in the nth subchannel as wn ∈ CNt ,

j,l,k

l,k

and the channel from BSj to MSl,k in the nth subchannel as hn ∈ CNt . We have

w

n l,k

ƒ= 0 if MSl,k is scheduled in the nth subchannel, and wn

= 0 otherwise. The

signal-to-interference-plus-noise ratio (SINR) for MSl,k in the nth subchannel can

then be expressed as

n H n 2

l,b

l,l,k

n |(wl,k)

hl,l,k|

Γl,k =

Σ |(wn )H hn

|2 + Σ Σ |(wn )H hn

j,b

j,l,k

l,k

|2 + σ2

, (3.38)

l,k

where σ2

is the noise power.

In order to satisfy the quality-of-service (QoS) requirement, the condition to sched-

l,k

▇▇▇ MSl,k in the nth subchannel is that the SINR Γn

should be no less than a target

l,k

γl,k, i.e., Γn ≥ γl,k. Here, a single SINR target is chosen for different subchannels.

Thus a MS can be scheduled in the subchannel which has the best channel and in-

terference condition. Moreover, we assume the total transmit power at each BS can

not exceed a maximum budget Pl, i.e., Σ

n∈N

Σk∈K

ǁwl,kǁ ≤ Pl. This constraint

n 2

allows the power to be distributed unevenly among the subchannels.

3.3.2 Optimization objectives: scheduling and beamforming

In our problem of interest, we have two objectives, namely scheduling and beam- forming. Scheduling is the primary objective, which is to maximize the number of MSs that can be scheduled in the available subchannels while satisfying both the

SINR constraints and the BS power constraints. Beamforming is the secondary ob- jective, which is to find the optimal beamforming vectors for those scheduled MSs that minimize the total transmit power.

We can describe this multi-objective optimization problem in two stages as in [26]. Let Sn ⊆ K denote the subset of MSs in the lth cell that are scheduled in the nth

subchannel, and |Sn| denote the cardinality of Sn. The first stage is a combinatorial

l l

optimization problem given by

max

Sn,wn

Σ Σ |Sn| (3.39a)

l l,k

n∈N l∈L

l,k

s.t. Σ Σ ǁwn ǁ2 ≤ Pl, ∀l ∈ L, (3.39b)

n∈N k∈Sn

n l,k

≥ γl,k, ∀n ∈ N , ∀k ∈ Sl , ∀l ∈ L, (3.39c)

Sn ∩ Sm = ∅, n m, ∀n, m ∈ N , ∀l ∈ L, (3.39d)

where constraint (3.39d) ensures that no MS can be scheduled in more than one subchannel.

{ }

Σ Σ Σ ǁw ǁ , (3.40a)

With the optimal sets Sn obtained by solving (3.39), the second stage is a down- link beamforming problem,

min

l,k

n 2

l,k

n∈N l∈L k∈Sn

s.t. (3.39b), (3.39c). (3.40b)

Problem (3.40) can be transformed into a SOCP problem [27] with complexity

n 3.5

O(((Σn Σl |Sl |)Nt) ) and solved efficiently by using general-purpose convex op-

timization toolboxes.

As an alternative to the two-stage formulation (3.39) and (3.40), we propose a joint

l,k

formulation by introducing auxiliary binary variables sn ∈ {0, 1}. Let sl,k = 1

l,k

if MSl,k is scheduled in the nth subchannel; and sn = 0 otherwise. The joint

formulation is given by

min

l,k

ǫ Σ Σ Σ ǁwn ǁ2 − Σ Σ Σ sn

n∈N l∈L k∈K

(3.41a)

l,k

wn , sn

l,k

s.t. Σ Σ ǁwn ǁ2 ≤ Pl, ∀l ∈ L, (3.41b)

n∈N k∈K

n H n 2 n n

|(wl,k)

hl,l,k|

+ Ml,k(1 − sl,k)

l,b

l,l,k

Σ |(wn )H hn

bƒ=k

|2 + Σ Σ |(wn )H hn

j,b

j,l,k

l,k

|2 + σ2

≥ γl,k,

n l,k

∀n ∈ N , ∀l ∈ L, ∀k ∈ K, (3.41c)

Σ s

∈ {0, 1}, ∀n ∈ N , ∀l ∈ L, ∀k ∈ K, (3.41d)

n l,k

≤ 1, ∀l ∈ L, ∀k ∈ K. (3.41e)

n∈N

step size -1, we choose 0 < ǫ < 1/(

l∈L Pl). This choice of ǫ implies that the maxi-

In the cost function (3.41a), the first term is the total transmit power scaled by a pos- itive constant ǫ and the second term counts the total number of admitted MSs. Since the total transmit power is boundedΣby l∈L Pl and the second term is discrete with

mum possible number of MSs will be scheduled and no other solution that schedules the same number of MSs can operate with less power [26]. Constraint (3.41c) defines

l,k

N inequalities for each MSl,k. When sn

= 1, the inequality is a standard SINR

l,k

constraint; when sn

= 0, the inequality does not impose any constraint on {wn }

l,k

provided that Mn

is large enough to satisfy the inequality for all possible values

of {wn }. By considering the power constraint (3.41b) and the ▇▇▇▇▇▇-▇▇▇▇▇▇▇

inequality, we choose the value of Ml,k as Ml,k ≥ γl,k

j∈L Pj ǁhj,l,kǁ

+ γl,kσl,k.

l,k

n n Σ

n 2 2

l,k

Constraint (3.41e) makes sure that each MS is scheduled in at most one subchan- nel. Note that, when all the binary variables {sn } are fixed, (3.41) is equivalent to

(3.40) in the two-stage formulation.

Constraint (3.41c) for each n, l, k can be formulated as a SOCP constraint,

[(un )T , σ

] ≤

.(1 +

1 )(wn )Hhn +

l,k (1 − sn ), (3.42)

l,k

γl,k

l,k

l,l,k

γl,k

l,k

where un

is a LK × 1 vector defined as un

= [(wn

H n

) h

1,l,k

, · · · , (wn

H n L,l,k

]T ,

l,k

) h

and (wn )Hhn

is constrained to be real valued and positive. By replacing (3.41c)

1,1

L,K

l,▇ ▇,l,k

with (3.42), we can transform (3.41) into a MI-SOCP problem, which can be solved,

e.g., by using the B&B solver CPLEX. The procedure of using CPLEX to solve this MI-SOCP problem is based on the relaxation of the binary variables. Specifically,

l,k

constraint (3.41d) is relaxed as 0 ≤ sn ≤ 1. Thus the MI-SOCP problem changes

into a SOCP problem of complexity O((LKNNt)3.5). However, the lower bound

l,k

found from relaxation in this problem is quite loose. This is because the chosen values of constants {Mn } are much larger than the sum of interference and noise

in (3.41c). Therefore, every constraint in (3.41c) can be satisfied with {sn } of such

values that the equality in (3.41e) becomes valid for every MS, i.e.,

n∈N sl,k = 1.

Σ l,k n

Thus the second term in the objective function (3.41a) will be LK, which seems like all the LK MSs are scheduled in the available subchannels, but in fact it is not. In the next section, we propose a customized B&B algorithm which avoids the relaxation. Our algorithm has tighter lower bound and also finds an upper bound with low complexity.

3.3.3 Branch and bound algorithm

In our B&B algorithm, we split the problem in (3.41) (i.e., the root) into sub- problems (nodes) by fixing a subset of the binary variables. We define a N × 1

j,b

binary vector si (i ∈ {1, · · · , LK}) for each MS, for example, si = [s1 , · · · , sN ]T

(∀j ∈ L, ∀b ∈ K). Because of the constraint (3.41e), si can be either a column of the identity matrix IN×N or an all-zero vector, so it has N + 1 possible combinations.

We can generate a tree with LK levels, where each level corresponds to a specified MS. The original problem (3.41) can be split into N + 1 nodes in the first level by fixing s1. Each of those nodes can be further split in the second level by further fixing s2, etc, thus generating (N + 1)LK nodes in the last level. Solving the SOCP problem for each and every node at the last level corresponds to the enumeration method which has prohibitive complexity. For this reason, we would like to prune nodes in the tree early on without going all the way down to the last level and we show how to achieve this in the following.

For a node in the tree, we calculate a lower bound (LB) and an upper bound (UB) for the optimal value of (3.41a). If the LB of a node is higher than the global upper bound (GUB), i.e., the tightest UB from all nodes already examined, this node and all its children nodes can be safely pruned without loss of optimality. This is because all children nodes are further restrictions of their parent node (each child node has one more binary vector fixed relative to its parent node), implying that the LB of a child node must be greater than or equal to the LB of its parent node. This implicit elimination is the key to computational savings, and it can be very effective if substantial pruning happens early in the process.

In order to make the algorithm more efficient, we need to define a proper order to fix the binary vectors of the MSs, i.e., for which MS it is fixed in the first level, which is fixed in the second level, etc. Since our primary objective is to maximize the number of scheduled MSs within the limited total transmit power and satisfying the individual SINR constraints, we have an intuition that the MS requiring small transmit power to satisfy its SINR constraint is more likely to be admitted, so we fix the binary vector of such a MS in a early level. Although a SOCP has to

l,k

be solved to determine the required transmit power ǁwn ǁ2 satisfying Γn ≥ γl,k,

we can find a LB for it by considering the interference-free case. The minimum

transmit power satisfying the QoS constraint can be find through the maximum

ratio transmission (MRT) as pn

= γl,kσ2

/ǁhn ǁ and the related beamforming

l,l,k

vector is vl,k =

pl,khl,l,k/ǁhn

n √ n n l,k

l,k

l,l,k

ǁ. We refer to pl,k and vl,k as MRT power and

n n

l,k

MRT beamformer, respectively. We calculate every pn

l,k

P = [p1,1, · · · , p1,K, · · · , pL,1, · · · , pL,K], where pl,k = [p1

to get a N × LK matrix

l,k

, · · · , pN ]T . Then we find

the minimum element in each column of P and sort them into a LK 1 vector in an

ascending order. This vector gives the order that we need, i.e., the MS corresponds to the i element of this vector will be fixed in the ith level.

l,k

{ }

· · · · · ·

Then we introduce how to calculate the UB. For a specified node in the ith level, we have a subset of i fixed binary vectors s1 si. By assuming all the other si+1 sLK to be all-zero vectors (i.e., not scheduled in any subchannel), we get a fixed set of binary variables sn , and the node corresponds to a SOCP problem (3.40). The complexity of this SOCP problem is O((iNt)3.5), since there are only i SINR constraints (3.39c). However, if using the relaxation method as in CPLEX, the same node corresponds to a SOCP problem with complexity O(((i + N (LK − i))Nt)3.5),

because of the additional N (LK − i) constraints (3.41c) and variables. If the SOCP

problem of the node is infeasible, this node can be pruned directly. Otherwise, we get a solution which can be used as a UB for this node. However, this UB is a loose one because we have assumed all the other unfixed MSs are not scheduled. We can tighten this UB by scheduling more MSs while keeping those already scheduled ones. We implement this by extending the low-complexity admission-control method in

[28] into our multi-subchannel model. In each subchannel, we keep the spatial signatures of beamformers for the already admitted MSs, but adjust their power (under the power limit) to schedule a new MS, and the beamforming vector for the new MS is calculated while satisfying all the constraints. This process is repeated until no more MS can be scheduled. In this way, we find a relatively tight UB avoiding to solve additional SOCPs.

· · ·

For the LB, we keep the solution from the loose UB. Since the beamforming vectors for those MSs with the fixed binary vectors s1 si were calculated, we know how much power was spent in each cell. With the remaining power budget, we try to schedule as many other MSs as possible by considering their MRT power under the power constraint, i.e., assuming the interference-free case for the other MSs. Since the MRT power is the least power required to admit a MS, we get a LB for the node. This LB is tighter than that of the relaxation method.

We also find an initial LB and an initial UB before splitting the original prob- lem into subproblems. We calculate the initial LB by minimizing the objective function (3.41a) under the constraints (3.41b), (3.41d) and (3.41e) while assuming

n 2 n

ǁwl,kǁ = pl,k. For the initial UB, we first schedule one MS in each subchannel,

which corresponds to the smallest element in each row of the MRT power matrix

P. Then we schedule more MSs with the same method as we do in tightening the

UB.

Our proposed optimal algorithm using B&B is summarized in Algorithm 2. We denote the node that selected to be split as s-node and we use a stack to keep track of nodes that require further examination. In step 1, if the calculated initial LB and initial UB are equal, we terminate the algorithm, and the optimal solution to the problem is obtained from the initial UB. Otherwise, we initialize s-node to be the root and stack to be empty, set GUB to be equal to the initial UB and go to step 2. In step 2, we implement the depth-first search. Specifically, a node is split into (N + 1) children nodes, but only the one with the lowest LB is further split into the next level, while any other unpruned nodes are inserted in the stack. We repeat step 2 and 3 until the stack becomes empty. The final optimal solution is obtained from the GUB.

For large size problems, the complexity of Algorithm 1 can be very high because a large number of nodes might be generated in step 2 and 3. Therefore, we can find a sub-optimal solution by implementing a fixed number of depth-first searches in Algorithm 1. If Q searches are implemented, the maximum number of nodes generated is (N + 1)LKQ.

Algorithm 2 Proposed optimal algorithm using B&B

1. Calculate an initial LB and an initial UB.

• ← ← ∅

If they are equal, terminate; else, initialize s-node root, stack ,

←

GUB initial UB, and go to 2.

2. Implement the depth-first search one time.

(a) Split s-node into N + 1 new nodes, and for every new node, solve a SOCP problem (3.40).

•

If it is infeasible, prune the node; else, calculate LB. If LB > GUB, prune the node; else, tighten UB.

• ←

If UB < GUB, GUB UB.

←

(b) s-node the one has the lowest LB in the unpruned new nodes, insert the other ones in stack, go to (a).

• ∅

3. Delete the nodes whose LB > GUB in stack.

• ∅ ←

If stack = , terminate; else, s-node the node with the lowest LB in stack, and go to 2.

We give an example to illustrate Algorithm 1 in Figure 3.4, where 2 cells with 2 MSs in each cell and 2 subchannels are considered. The branching order is that, the binary vector for MS1,1 is fixed in level 1, followed by MS1,2 in level 2, MS2,2 in level 3 and MS2,1 in level 4. The nodes are numbered in the same order as they are generated. Since the initial UB and the initial LB are not equal to each other, we split the root into three nodes in level 1 by fixing s1. Then node 1 and 3 are pruned because their LB are higher than GUB and we do not need to tighten their UB. Since node 2 is the only node left, it is split in level 2 where s2 is further fixed. In level 2, node 5 and 6 are pruned and node 4 is split into level 3. In level 3, GUB is updated to be equal to the UB of node 7, node 9 is pruned, node 8 is split into level 4 and node 7 is inserted in stack. In level 4, nodes 10, 11, and 12 are all pruned since their LB are higher than the updated GUB, and therefore node 8 is also pruned. Next, take node 7 out of stack and split it. Then nodes 13 and 15 are pruned. Now all the nodes in the tree, except node 14 and its parents (nodes 7, 4, and 2), have been pruned and stack becomes empty. Therefore, the optimal binary vectors in this example are s1 = [1 0]T , s2 = [0 1]T , s3 = [0 1]T and s4 = [1 0]T , and the optimal beamforming vectors are also obtained when calculating the GUB. In this example, we find the optimal solution by generating 15 nodes, while by using the relaxation method with CPLEX there are 26 nodes generated, and by brute-force searching we have to solve all 34 nodes. For larger size problems, we can even save more complexity.

ROOT

initial UB = −3.4516 initial LB = −3.4747

S1 = [0 1] T

S1 = [1 0] T

UB = −3.4516

LB = −3.4747

S1 = [0 0] T

LB = −3.4139

LB = −2.525

Level 1

S2 = [0 1] T

UB = −3.4516

LB = −3.4747

S2 = [1 0] T

S2 = [0 0] T

LB = −3.4309

LB = −2.529

Level 2

S3 = [0 1] T

UB = −3.4622

LB = −3.4707

S3 = [1 0] T

UB = −3.392

LB = −3.4745

S3 =T [0 0]

LB = −2.5464

Level 3

S4 = [0 1] T

S4 = [1 0] T

UB = −3.4622

LB = −3.4622

S4 = [0 0] T

S4 = [0 1] T

S4 = [1 0] T

S4 = [0 0] T

S1 = [1 0] T

S2 = [0 1] T

S3 = [0 1] T

S4 = [1 0] T

LB = −3.4036

LB = −2.582

LB = −3.4516

LB = −3.4501

LB = −2.582

Level 4

Figure 3.4: An example for the B&B algorithm

3.3.4 Simulations

In our simulations, we consider the overlapping multicell networks from two opera- tors. As shown in Figure 3.5, two strategies for the BSs deployment are involved, namely cositing and non-cositing. In the cositing strategy, the BSs of the two op- erators share the same mast or located very closely; whereas in the non-cositing strategy, the BSs of one operator are located apart from those of the other. The network of each operator has 7 hexagonal sectors and each sector has one BS. An inter-site distance of 500 meters is considered. Each operator has K MSs randomly located within the center sector. The number of available subchannels is N . The radio channel traces are generated using the quasi-deterministic radio channel gen- erator (QuaDRiGa) [29] recently released by ▇▇▇▇▇▇▇▇▇▇ ▇▇▇▇▇▇▇▇ ▇▇▇▇▇ Institute. The WINNER+ model [30] is used with the major parameters summarized in TA- BLE 3.2.

We consider two spectrum allocation scenarios, i.e., sharing and non-sharing. In the sharing scenario, the N subchannels are shared by both operators; while in the non-sharing scenario, each operator exclusively uses N/2 subchannels. Moreover, we assume that only the two center sectors (each from one operator) are coordinated, whereas all the surrounding sectors are merely included to establish a more realistic interference environment. Thus the interference from the surrounding sectors will

Operator & Operator 2

Operator 1

Operator 2

COSITING NON−COSITING

Figure 3.5: Overlapping multicell networks from two operators

Parameter name	Parameter value
According to WINNER+ scenario	C2 NLOS
Carrier frequency (GHz)	2.6
Path loss (dB)	35.05 log(d) + 36.70
Intra / inter-site correlation	1 / 0.5
Number of BSs	14
Number of antennas BSs / MSs	4 / 1
Heights of BSs / MSs (m)	32 / 2
Antenna pattern of BSs	KATHREIN 80010541 (10 degrees down tilt)
Antenna pattern of MSs	omni-directional

Table 3.2: Propagation parameters

not be mitigated during the optimal scheduling and beamforming.

We assume the two coordinated sectors have the same power budget, i.e., Pl = P , (l = 1, 2), in both the sharing and non-sharing scenarios. The transmit power in each of the surrounding sectors is assumed to be αP , where α is an activity factor that determines the interference strength from the surrounding sectors. Moreover, the power αP is evenly distributed among the available subchannels. Therefore, the transmit power at each subchannel in each surrounding sector is αP/N for the sharing scenario and 2αP/N for the non-sharing scenario. Note that the number of interfering sectors is different for the sharing and non-sharing scenario. In the shar- ing scenario, all 12 surrounding sectors from both operators generate interference; while in the non-sharing scenario only 6 surrounding sectors from the same opera- tor generate interference. The interference from jth interfering sector to kth MS of

2 n H n

j,i,k

ith operator in nth subchannel is (αP/N )|(wn)Hhn | and (2αP/N )|(w ) h |

for the sharing and non-sharing scenario, respectively. In the calculations, we use a

random transmit beamforming vector wn, which has uniform norm (i.e., ǁwnǁ = 1),

j j

for the jth surrounding sector and nth subchannel.

We assume the SINR targets for all the MSs are the same, i.e., γl,k = γ. The power budget for the two coordinated sectors is 0.4 W per subchannel, i.e., P = 0.4N W. By using the proposed B&B algorithm, we find the total number of scheduled MSs of the two operators in the sharing and non-sharing scenarios for both the cositing

and non-cositing networks. Assume the total number of scheduled MSs is K′, the throughput (i.e., sum rate of the K′ MSs), can be calculated as K′ log2(1+γ), where γ is the SINR target. We define the SAPHYRE gain to be the ratio of throughput

between the sharing scenario and the non-sharing scenario. In the following, we show a small-scale example and a large-scale example, where different number of N and K are considered. In each example, 10 realizations of the MSs positions are simulated and the results are averaged.

In the small-scale example, we assume only two subchannels are available and each operator has 10 MSs, i.e., N = 2, K = 10. The power budget for the two coordinated sectors is 0.8 W. In this example, for the sharing scenario, we find a suboptimal solution by implementing 5 times depth-first search in Algorithm 2; while for the non-sharing scenario, we find the optimal solution (due to the low complexity with small values of N and K). Regarding to the interference from surrounding sectors, we consider three cases: (a) no interference from surrounding sectors, i.e., α = 0;

(b) relatively weak interference with α = 0.1; (c) relatively strong interference with α = 0.5. The average number of scheduled MSs and average throughput for various SINR targets for the three cases are shown in Figure 3.6 - 3.11.

We first look at the case of α = 0 in Figure 3.6 and 3.7. For the non-sharing scenario, almost the same number of MSs are scheduled in both the cositing and non-cositing network. The number of scheduled MSs in the non-sharing scenario is 8 when SINR

> 8 dB. This is because each BS has 4 antenna and a maximum number of 4 MSs can be served in each subchannel with large SINR. For the sharing scenario, a larger

number of MSs can be scheduled in both networks and the numbers decrease as SINR increases. Moreover, the number of scheduled MSs in non-cositing network is larger than that in the cositing network. We find the maximum SAPHYRE gain is more than 1.5 for the non-cositing network and about 1.25 for the cositing network, both of which are achieved at SINR = 8 dB. As the SINR increases, the gain becomes smaller. At SINR = 20 dB, the SAPHYRE gain becomes about 1.25 for the non-cositing network, and it becomes 1 (i.e., no gain) for the cositing network.

The results for the case of α = 0.1 are given in Figure 3.8 and 3.9. For the non- sharing scenario of both networks, the number of scheduled MSs does not keep at 8, but decreases when the SINR becomes larger and larger due to the extra interference from the surrounding sectors. For the sharing scenario, a larger number of MSs are scheduled in the non-cositing network than in the cositing network, and for both networks the number decreases as SINR increases. Again, in this case the SAPHYRE gain in the non-cositing network is higher than that in the cositing network.

For the case of α = 0.5 in Figure 3.10 and 3.11, due to the relatively high interference from the surrounding sectors, the number of scheduled MSs for both networks and both scenarios drops quickly as the SINR increases. Moreover, the SAPHYRE gain becomes smaller in both networks compared with the cases of α = 0 and α = 0.1.

Next, we consider a large-scale example where 10 subchannels are available and each operator has 50 MSs, i.e., N = 10, K = 50. In this example, the power budget for the two coordinated sectors is 4 W, and we only consider the case of small interference from surrounding sectors, i.e., α = 0.1. Moreover, due to the large value of N and K, for both the sharing and non-sharing scenario we find a suboptimal solution by implementing only one time depth-first search in Algorithm 2. The results are given in Figure 3.12 and 3.13, where we find the curves have the similar trend as those in Figure 3.8 and 3.9. In the non-sharing scenario, the numbers of scheduled MSs are almost the same in both networks. Specifically, with SINR between 6 and 16 dB, 40 MSs are scheduled, and the numbers decrease by further increasing the SINR. The maximum SAPHYRE gain is larger than 1.5 for the sharing scenario and it is larger than 1.25 for the non-sharing scenario, both of which are achieved at SINR = 6 dB. As SINR increases, the SAPHYRE gain becomes smaller in both networks.

In conclusion, we have found that a larger SAPHYRE gain can be achieved in the non-cositing network than in the cositing network. In general, the gain becomes smaller as the SINR targets increases. In our simulations, the maximum SAPHYRE gain achieved is about 1.5 in the non-cositing network and about 1.25 in the cositing network. This achievement is under the condition that the size of the user pool is 5 MSs per subchannel, and a suboptimal solution is found for the sharing scenario. We can expect a greater gain, if we have a larger size pool of MSs to choose from, or the optimal solution is found.

Cositing: sharing

Average number of scheduled MSs

16 Cositing: non−sharing

Non−cositing: sharing

14 Non−cositing: non−sharing

0 5 10 15 20 25 30

SINR (dB) target

Figure 3.6: Average number of scheduled MSs with N = 2, K = 10, α = 0

Average throughput (bps/Hz)

30 Cositing: sharing

Cositing: non−sharing

20 Non−cositing: sharing

Non−cositing: non−sharing

0 5 10 15 20 25 30

SINR (dB) target

Figure 3.7: Average throughput with N = 2, K = 10, α = 0

Cositing: sharing

Average number of scheduled MSs

16 Cositing: non−sharing

Non−cositing: sharing

14 Non−cositing: non−sharing

0 5 10 15 20 25 30

SINR (dB) target

Figure 3.8: Average number of scheduled MSs with N = 2, K = 10, α = 0.1

Average throughput (bps/Hz)

20 Cositing: sharing

Cositing: non−sharing

15 Non−cositing: sharing

Non−cositing: non−sharing

0 5 10 15 20 25 30

SINR (dB) target

Figure 3.9: Average throughput with N = 10, K = 50, α = 0.1

Cositing: sharing

Average number of scheduled MSs

16 Cositing: non−sharing

Non−cositing: sharing

14 Non−cositing: non−sharing

0 5 10 15 20 25 30

SINR (dB) target

Figure 3.10: Average number of scheduled MSs with N = 2, K = 10, α = 0.5

Average throughput (bps/Hz)

Cositing: sharing

5 Cositing: non−sharing Non−cositing: sharing

Non−cositing: non−sharing

0 5 10 15 20 25 30

SINR (dB) target

Figure 3.11: Average throughput with N = 2, K = 10, α = 0.5

Cositing: sharing

Average number of scheduled MSs

80 Cositing: non−sharing

Non−cositing: sharing

70 Non−cositing: non−sharing

0 5 10 15 20 25 30

SINR (dB) target

Figure 3.12: Average number of scheduled MSs with N = 10, K = 50, α = 0.1

300

Average throughput (bps/Hz)

250

200

150

100

0 5 10

Cositing: sharing Cositing: non−sharing Non−cositing: sharing

Non−cositing: non−sharing

15 20 25 30

SINR (dB) target

Figure 3.13: Average throughput with N = 10, K = 50, α = 0.1

4 Distributed MIMO Signal Processing and Resource Allocation

4.1 Cooperative Beamforming

In this section, a distributed beamforming algorithm is proposed for the two-user multiple-input single-output (MISO) interference channel (IC). The algorithm is it- erative and uses as bargaining value the interference that each transmitter generates towards the receiver of the other user. It enables cooperation among the transmit- ters in order to increase both users’ rates by lowering the overall interference. In every iteration, as long as both rates keep on increasing, the transmitters mutually decrease the generated interference. They choose their beamforming vectors dis- tributively, solving the constrained optimization problem of maximizing the useful signal power for a given level of generated interference. The algorithm is equally ap- plicable when the transmitters have either instantaneous or statistical channel state information (CSI). The difference is that the core optimization problem is solved in closed-form for instantaneous CSI, whereas for statistical CSI an efficient solution is found numerically via semidefinite programming. The outcome of the proposed algorithm is approximately Pareto-optimal. Extensive numerical illustrations are provided, comparing the proposed solution to the ▇▇▇▇ equilibrium, zero-forcing, ▇▇▇▇ bargaining, and maximum sum-rate operating points.

4.1.1 System model and preliminaries

We assume that transmission consists of scalar coding followed by beamforming1 and that all propagation channels are frequency-flat. The matched-filtered symbol- sampled complex baseband data received by RXi is modeled as2

▇▇

▇▇ = hHwisi + hHwjsj + ei, j ƒ= i, i, j ∈ {1, 2}, (4.1)

where si ∼ CN(0, 1) and wi ∈ Cn are the transmitted symbol and the beamforming vector, respectively, employed by TXi. Also, ei ∼ CN(0, σ2) models the receiver noise. The (conjugated) channel vector between TXi and RXj is modeled as hij ∼ CN(0, Qij). We denote rij ¾ rank{Qij}. In the case of instantaneous CSI, TXi accurately knows the channel realizations hii and hij, whereas for statistical CSI it

only knows the channel covariance matrices Qii and Qij.

1This is optimal in the case of instantaneous CSI, but not necessarily for statistical CSI, see [31]. 2Whenever an expression is valid for both systems, it is denoted once with respect to system i and the index j ƒ= i refers to the other system.

The transmission power is bounded due to regulatory and hardware constraints, such as battery and amplifiers. Without loss of generality, we set this bound to 1. Hence, the set of feasible beamforming vectors is

W ¾ {w ∈ Cn | ǁwǁ2 ≤ 1}. (4.2)

Note that the set W is convex. In what follows, a specific choice of wi ∈ W is denoted as a transmit strategy of TXi.

When the transmitters perfectly know the channel vectors and the receivers treat interference as noise, the achievable instantaneous rate (in bits/channel use) for link i is [32]

Ri(wi, wj) = log2

.1 +

|hii wi|

H 2

|hHwj|2 + σ2

. (4.3)

It is evident that the rate on each link depends on the choice of both beamforming vectors. We define the power that RXi receives from TXj as

pji(wj) ¾ |hHwj|2 = wHhjihHwj. (4.4)

Then, we can write (4.3) as

R (w , w ) = log

.1 + pii(wi) Σ , (4.5)

(w ) + σ2

which is monotonously increasing with the useful signal power pii(wi) for fixed received interference power pji(wj) and monotonously decreasing with pji(wj) for fixed pii(wi).

The main goal of the bargaining algorithm we introduce in Section 4.1.2 is to agree on a PO solution. Hence, we restrict our attention to the beamforming vectors which are candidates to achieve PO points. From [32], we know that the PO beamforming vectors use full power and that they are linear combinations of the MR and ZF strategies

λiwMR + (1 − λi)wZF

wPO(λi) = i i

(4.6)

ǁλiwMR + (1 − λi)wZFǁ

for λi ∈ [0, 1], where

h

Π⊥ hii

MR ii

ZF hij

wi =

ǁhii

and wi =

⊥

Π

hij

hii

. (4.7)

The outcome when both transmitters use their MR strategies is the NE. When both use their ZF we refer to the ZF point.

When the transmitters only have statistical knowledge of the channels, it is natural to design the achievable the beamforming vectors with respect to the ergodic rates,

which are obtained by averaging over the channel realizations. From [33], we have3

Ri(wi, wj) ¾ E

.log2

.1 +

|hii wi|

H 2

ΣΣ

|hHwj|2 + σ2

ji i

(4.8)

= p ii(wi) fi(pii(wi)) − fi(pji(wj)),

where

ln 2 pii(wi) − pji(wj)

i i

f (x) ¾ eσ2/x

∞

∫

σ2/x

e−t

dt. (4.9)

In (4.8), pji(wj) denotes the average power that RXi receives from TXj

pji(wj) = E.wHhjihHwjΣ = wHQjiwj. (4.10)

Note that the final terms in both (4.4) and (4.10) are convex homogeneous quadrat- ics. The difference is that the parameter (channel) matrix in (4.4) is rank-1 by definition, whereas in (4.10) it can have any rank.

The ergodic rate (4.8) has the same behavior as the instantaneous rate (4.3), i.e., it is monotonously increasing (decreasing) with pii(wi) (pji(wj)) for fixed pii(wi) (pji(wj)) [33]. Also, for points on the Pareto boundary we know that wi ∈ R{Qii, Qij} [31]. The MR strategy wMR is the dominant eigenvector of Qii

[33]. When R{Qii} ¢ R{Qij}, the ZF strategy wZF is the dominant eigenvector of

wZF = 0 [33].

ΠN{Qij }QiiΠN{Qij } and when R{Qii} ⊆ R{Qij}, e.g., when Qij is full-rank, then

In the following, we introduce some operating points, which are important in the sense that they lie on the outer boundary of the rate region; see, e.g., [34] and references therein.

Single-user (SU): The points achieved when one transmitter employs its MR strat- egy while the other refrains from transmission.

−

Maximum sum-rate (SR): The point where the sum of the rates is maximum. Graphically, it is the point where a line of slope 1 touches the Pareto bound- ary of the rate region.

▇▇▇▇ bargaining solution (NBS): The outcome of a ▇▇▇▇ bargaining is a point (R¯1, R¯2) such that (R¯1 − R∗)(R¯2 − R∗) is maximized for some threat point (R∗, R∗)

and R¯i ≥ R∗. It is natural to use the NE as the threat point, since it is the

only reasonable outcome if the systems are not able to agree on a solution. The

NBS is only defined on convex utility regions, but we will call the solution to the corresponding optimization problem the NBS.

3We deliberately use the same symbols, as in the case of instantaneous CSI, to denote the rate and the power (R and p, respectively) in order to facilitate in the sequel a uniform treatment of both CSI scenarios.

4.1.2 Proposed algorithm

Here we elaborate the proposed bargaining algorithm that enables the transmitters to distributively design their beamforming vectors. We assume that there exists a feedback link from every receiver to every transmitter. The receivers use these links to feedback CSI. Each transmitter has CSI only on the links it is affecting. The transmitters are assumed synchronized, but no information (CSI or user data) is exchanged between them.

In the algorithm, we use as bargaining value an upper bound on the interference generated by system i to system j. This bound, denoted cij, is adjusted in every iteration. During the bargaining, the receivers feed back a one-bit message that tells the transmitters whether the iteration was successful or not, i.e. whether the rates increased or not. We denote l the iteration counter, which also acts as a quantitative measure of the overhead (total number of bits per RX-TX feedback link) and the computational complexity (total number of optimization problems that need to be solved).

A flowchart of the algorithm is illustrated in Figure 4.1. The first step of the algorithm is the decision whether the initialization point will be the NE or the ZF point. For this reason, the transmitters send two pilots using their MR and ZF beamforming vectors. The receivers measure the SINR for each transmission and feed back one-bit of information telling the transmitters which strategy yields higher SINR, hence rate. If Ri(wZF, wZF) ≥ Ri(wMR, wMR) for both systems, the

algorithm is initialized with the ZF point, since it is closer than the NE to the

Pareto boundary. Hence, the algorithm will require fewer iterations to converge to a solution. If only one system achieves higher rate with the ZF strategies, there is no incentive for the other to accept the ZF point as initial point. The algorithm is then initialized with the NE point.

Then, the algorithm sets the stepsize for updating cij. As with any iterative algo- rithm, the best output is obtained for an infinitesimal stepsize. However, this is not practical, so we consider instead a fixed stepsize4. We assume that TXi samples the interval [0, pij(wMR)] uniformly in N + 1 points, to allow up to N iterations. This gives the step δij = ±pij(wMR)/N . The sign of δij depends on the initial point. If the algorithm is initialized with the NE, δij will be negative (decreasing interfer- ence). Otherwise, δij will be positive (increasing interference). At iteration l, TXi

updates the interference level as cl

= cl−1 + δij and solves the problem

max

wi∈W

pii(wi) (4.11)

s.t. pij(wi) ≤ cl . (4.12)

The optimal solution of problem (4.11)–(4.12) is the beamforming vector which maximizes the useful power given that the generated interference is cl . As long

4Also, an adaptive stepsize can easily be incorporated to the algorithm.

as cl is chosen in the range [0, pij(wMR)], there always exists a feasible solution to

(4.11)–(4.12) [35]. The lower and upper end on the interference level correspond to the ZF and MR strategies, respectively. Furthermore, the bound will be tight at the optimum; hence, the inequality in (4.12) can be equivalently replaced with equality. We propose a solution to the optimization problem (4.11)–(4.12) in the following for the case of instantaneous and statistical CSI, respectively. TXi uses wl to transmit

a pilot. RXi measures Ri(wl, wl ) and if it is no smaller than Ri(wl−1, wl−1), it feeds

i j i j

back a one-bit message telling the transmitters to continue updating the interference

level. As soon as the rate decreases for at least one of the receivers, the algorithm terminates and the transmitters will use the beamforming vectors from the previous iteration.

Use wl−1

R (w , w ) ≥ R (w , w )

Yes

R (w , w ) ≥ R (w , w )

MR MR

2 1

R (w , w ) ≥ R (w , w )

l−1

1 2

Yes

R (w , w ) ≥ R (w , w )

l−1 l−1

2 1

Transmit pilot using ▇▇

▇ ▇

w ← sol. of (4.11)–(4.12)

= cl−1 + δij

l ← l + 1

δij = pij(wMR)/N

i i

c1 = 0

l = 1

w1 = wZF

p (w )/N

δ = −

i i

c1 = pij(wMR)

l = 1

w1 = wMR

Transmit pilots using wMR and wZF

Figure 4.1: Flowchart describing the proposed cooperative beamforming algorithm

We claim that the algorithm is self-enforced. Suppose that, in one of the steps, TXi chooses to cheat by not decreasing the interference level. Then, the rate of system j will decrease and RXj will feedback a negative bit. According to the last step of the algorithm, the transmitters are expected to choose the beamforming vectors from the previous iteration. If TXi does not, RXj will notice and report it to TXj. Then, TXj will leave the bargaining and employ its MR beamforming vector instead. That is, if one system tries to cheat, then the cooperation is canceled and the operation falls back to the NE (the so-called threat point, in the context of ▇▇▇▇ bargaining).

We first consider the instantaneous CSI. By inserting the expression (4.4) in (4.11)– (4.12), with inequality changed to equality, we get the problem

max

wi∈W

|hii wi| (4.13)

H 2

s.t. |hHwi|2 = cij. (4.14)

H PO 2

Since the objective of the algorithm is to find a PO point, the transmitters are only willing to use beamforming vectors that are candidates for achieving PO points. Any other beamforming vector will be a waste of power. Using (4.6) we get

λi∈[0,1]

max

|h w (λi)| (4.15)

s.t. |hHwPO(λi)|2 = cij. (4.16)

Note that the optimization (4.15)–(4.16) is now only with respect to the real scalar λi. Furthermore, the power constraint is obsolete, since the PO beamforming vectors use full power. That is, the inequality constraint in (4.2) is met with equality. Instead, we have a constraint on the range of the weighting factor λi. Finally, it is straightforward to see that the objective function (4.15) is monotonously increasing with λi. Thus, we can equivalently rewrite (4.15)–(4.16) as

max

λi∈[0,1]

λi (4.17)

s.t. |hHwPO(λi)|2 = cij. (4.18)

To simplify notation, we define

hij

αi ¾ (|hHhii|/ ǁhijǁ)2 and βi ¾ Π⊥ hii / ǁhiiǁ . (4.19)

The values (4.19) are only calculated once per channel realization. For cij > 0 we write (4.18) as

ij ⇔

λ2αi

= c

λ2 + (1 − λi)2 + 2λi(1 − λi)βi

λ2(αi/cij + 2βi − 2) + λi(2 − 2βi) − 1 = 0.

When cij = 0, the ZF strategy is the optimal solution (i.e, λi = 0). Now, we write (4.17)–(4.18) as

▇▇▇ ▇▇ (4.20)

s.t. λ2(αi/cij + 2βi − 2) + λi(2 − 2βi) − 1 = 0, (4.21)

0 ≤ λi ≤ 1. (4.22)

The solution to (4.20)–(4.22) is the largest of the two solutions to (4.21) that satisfies (4.22).

Then we consider the statistical CSI. By inserting (4.10) in (4.11)–(4.12) we get

max

wi∈Cn

wHQiiwi (4.23)

s.t. wHQijwi ≤ cij, (4.24)

wHwi ≤ 1. (4.25)

Problem (4.23)–(4.25) is a quadratically constrained quadratic program (QCQP). The feasibility set determined by (4.24)–(4.25) is convex. However, the optimization is non-convex owing to the form of the objective function. However, it can still be solved optimally and efficiently using semidefinite relaxation. This is because semidefinite relaxation is tight for QCQP problems of the form in (4.23)–(4.24), as shown in [36].

We briefly elaborate the procedure, similar to the way we did in [35]. We change the optimization variables to Wi ¾ wiwH. Note that

Wi = wiwH ⇔ Wi ≤ 0 and rank{Wi} = 1. (4.26)

{ } { }

Using (4.26) and the property that tr YZ = tr ZY for matrices Y, Z of com- patible dimensions, the average power term in (4.24) can be written as

wHQijwi = tr .wHQijwiΣ = tr .QijwiwHΣ

= tr {QijWi} . (4.27)

Due to (4.26) and (4.27), we equivalently recast (4.23)–(4.24) as

max

W∈Cn×n

tr {QiiWi} (4.28)

s.t. tr {QijWi} ≤ cij, (4.29) tr {Wi} ≤ 1, (4.30)

Wi ≤ 0, (4.31)

rank{Wi} = 1. (4.32)

The objective function (4.28), the constraints (4.29) and (4.30) are linear. The cone of positive semidefinite matrices (4.31) is convex. But the rank constraint (4.32)

is non-convex. Dropping it, the remaining problem (4.28)–(4.31) is a semidefinite programming (SDP) problem, which can be solved efficiently. Due to the absence of (4.32), the SDP problem will not necessarily return rank-1 optimal matrices. We experienced through extensive simulations that it actually does yield rank-1 matrices.

4.1.3 Numerical illustrations

We present extensive simulation results to evaluate the performance of the algorithm we propose. We focus on the case of statistical CSI, but also provide some results for instantaneous CSI. Firstly we explain how we generate CSI (i.e., channel covariance matrices or channel vectors) for simulation purposes. Then we compare the outcome of the algorithm to the NE, ZF, SR, and NBS, followed by showing exemplary bargaining trajectories. Finally, in we illustrate the SAPHYRE gain. Throughout the simulations, we assume that the transmitters use n = 5 antennas. We allow our algorithm run up to N = 20 iterations. The results reported in Figure 4.2–4.7 are averages over 100 ▇▇▇▇▇-▇▇▇▇▇ (MC) runs. Figure 4.2–4.7 illustrate the sum of the transmission rates, i.e., R1 + R2. Figure 4.8–4.11 show examples of achievable rate regions, i.e., for a single CSI realization.

We generate the direct and the crosstalk channels in two different ways, to model the scenarios of weak or strong spatial correlation. Specifically, in the case of in- stantaneous CSI and weak correlation, we generate the channel vectors hii and hij drawing independent samples from CN(0, I). For the scenario of strong correlation, we use the formula

hij = µihii + .1 − µ2h˜ij , (4.33)

∈

where hii and h˜ij are drawn from CN(0, I), and µi [0, 1]. A value of µi close to 1 refers to the case of strong interference.

In the case of statistical CSI, we construct the covariance matrices, of rank r, randomly as

ΣQ = q q , (4.34)k

▇

▇=1

where qk ∼ CN(0, I). For the scenario of weak correlation, we generate the covari- ance matrices Qii and ▇▇▇ independently according to (4.34). For the scenario of strong correlation, we construct the matrices such that the angle between the eigen- vectors of the direct matrix and the eigenvectors of the crosstalk matrix is small. Assuming that rii ≤ rij, we first generate Qii as in (4.34). Then, we construct the

vectors {qij,k}k that define Qij as

qij,k = q˜ij,▇, ▇ > rii

(4.35)

. qij,k = µiqii,k + √1 − µ2q˜ij,▇, ▇ ≤ rii

∼ ∈

where q˜ij,k CN(0, I) and µi [0, 1]. If rii > rij, the matrices are constructed the other way around.

Here we provide results for statistical CSI, both for weak and strong spatial cor- relation. Also, we distinguish among the cases of having full-rank and low-rank covariance matrices. In the low-rank scenario, the covariance matrices of the direct- channels have rank r11 = r22 = 2, and covariance matrices of the cross-talk channels have rank r12 = r21 = 4. For strong correlation, we use µi = 0.8.

In Figure 4.2 and 4.3, we study the full-rank scenario. First, we note that the ZF

sum rate is equal to 0 since full-rank crosstalk matrices correspond to wZF

= 0.

Second, we see that the sum rates for the proposed algorithm, the NBS, and the NE saturate for high SNR. The reason is that when the SNR is high, interference is the main limiting factor. Since the interference cannot become zero, except for the SU-points, there should be a limitation. Third, since there is no interference at the SU points, the corresponding rates will grow unbounded with SNR and the SR will be found at a SU point for high SNR.

In Figure 4.4 and 4.5, we illustrate the low-rank scenario. Here, all points but the NE converge to the same sum rate at high SNR. The difference is that, for strong correlation they converge at higher SNR value than for weak correlation. Also, we see that the rates grow almost linearly with the SNR. In general, there exists a non- trivial zero-forcing point for the case of low-rank matrices. Using this, the noise is the only limitation. When the noise decreases, the rates increase. At low SNR, the ZF starts growing later for strong correlation than for weak correlation.

Furthermore, we evidence that weak correlation (Figure 4.2 and 4.4) gives higher rates for the proposed algorithm, the NE, and the NBS, than strong correlation (Figure 4.3 and 4.5). As a general remark, low SNR means operation in the noise- limited regime and all the rates but the ZF are almost the same.

Concluding, we see that the performance of the proposed algorithm is slightly below the NBS and close to the SR, except for full-rank matrices and high SNR. Most important, the algorithm performs consistently much better than the NE, which would be the outcome if there was no cooperation.

In Figure 4.6 and 4.7 we report the results for weak and strong correlation (µi = 0.9), respectively. We note that the curves behave similarly to the ones in Figure 4.4 and

4.5. The reason for this is that the case of instantaneous CSI can be regarded as a specific instance of the low-rank statistical CSI when all covariance matrices are rank-1.

Then we give examples of the bargaining trajectory, i.e., the rate points (marked with stars) reached at every iteration of the proposed algorithm. Here, the maximum number of iterations used is N = 10. Figure 4.8 and 4.9 illustrate the trajectories for statistical CSI with full-rank covariance matrices and SNR equal to 0 and 10 dB, respectively. The Pareto boundary is calculated using the technique proposed in [35]. Figure 4.10 and 4.11 illustrate the trajectories for instantaneous CSI and

SNR equal to 0 and 10 dB, respectively. The Pareto boundary is calculated using the technique proposed in [32].

For statistical CSI and full-rank matrices, the algorithm is always initialized with the NE, since a non-trivial ZF point does not exist. For instantaneous CSI and low SNR it is initialized with the NE point, while for high SNR with the ZF point. We note that the final outcome of the bargaining algorithm is close to the Pareto boundary, but does not necessarily lie on it. On one hand, the final outcome depends on the stepsize of the algorithm. On the other hand, the algorithm terminates when either of the rates stops increasing, i.e., when the tangent of the trajectory stops being positive. More on, the outcome is close to NBS, but generally far from SR. We note that the SR does not imply that both systems have increased their rates compared to NE, while the proposed algorithm and NBS guarantee that both systems get at least their NE rates. Finally, in all figures we show what the bargaining trajectory would look like if the algorithm went the entire way from one extreme point (NE or ZF) to the other with small steps. Note that for statistical CSI and full-rank matrices the ZF point corresponds to the origin of the rate region.

≈

At last, we illustrate the SAPHYRE gain of the proposed algorithm over the non- sharing situation as a function of the SNR. In Figure 4.2–4.7 we illustrate the sum- rate of the orthogonal (TDMA) sharing. We see, Figure 4.2 and 4.3, that TDMA sharing performs better than the proposed algorithm for the case of statistical CSI and full-rank covariance matrices and high SNR. Also, for statistical CSI with low- rank covariance matrices and medium SNR, Figure 4.5, the TDMA scheme gives higher rates than our algorithm. In Figure 4.12 and 4.13 we illustrate the absolute and fractional SAPHYRE gains, respectively. The illustrations are made for the weak spatial correlation scenario. For each SNR value we have made 100 ▇▇▇▇▇- ▇▇▇▇▇ simulations. The absolute SAPHYRE gain is defined as in (1.1), but for K = 2 users. We see that the absolute gain increases with increasing SNR for the scenarios of instantaneous CSI and statistical CSI with low-rank matrices. For statistical CSI the gain starts to decrease at SNR 5dB and becomes negative at SNR 18dB. We define the fractional SAPHYRE gain as in (1.2). In Figure 4.13, we see that the more we know about the channels, the higher is the fractional SAPHYRE gain. For the scenario of instantaneous CSI, we have a fractional gain close to 2.0, i.e., the rate is doubled. Again, we see that scenario of high SNR and statistical CSI with full-rank covariance matrices yield a fractional SAPHYRE gain ΞF < 1.

From the rate regions in Figure 4.8, 4.10, and 4.11 we see that the orthogonal (TDMA) sharing region lies inside the non-orthogonal sharing region. Also, we notice that the outcome of the proposed algorithm lies outside the TDMA region. Studying Figure 4.9 (statistical CSI and high SNR), we see that the TDMA region does not entirely contained in the non-orthogonal sharing region. The outcome of the algorithm lies south-west of the TDMA boundary. This illustrates why we get ΞF < 1.

Stat. CSI (full-rank), µi = 0, n = 5, N = 20, 100 MC

▇▇▇▇-Equilibrium

R1 + R2 [bits/channel use]

Zero-Forcing ▇▇▇▇-Bargaining Sum-Rate Cooperative-Algo

10 No sharing (TDMA)

-10 -5 0

5 10

SNR [dB]

15 20 25 30

Figure 4.2: Sum rate; stat. CSI (full-rank), µi = 0

Stat. CSI (full-rank), µi = 0.8, n = 5, N = 20, 100 MC 15

▇▇▇▇-Equilibrium

R1 + R2 [bits/channel use]

Zero-Forcing ▇▇▇▇-Bargaining Sum-Rate Cooperative-Algo

10 No sharing (TDMA)

-10 -5 0

5 10

SNR [dB]

15 20 25 30

Figure 4.3: Sum rate; stat. CSI (full-rank), µi = 0.8

Stat. CSI (low-rank), µi = 0, n = 5, N = 20, 100 MC

▇▇▇▇-Equilibrium

R1 + R2 [bits/channel use]

Zero-Forcing ▇▇▇▇-Bargaining

Sum-Rate

15 Cooperative-Algo

No sharing (TDMA)

-10 -5 0

5 10 15

SNR [dB]

20 25 30

Figure 4.4: Sum rate; stat. CSI (low-rank), µi = 0

Stat. CSI (low-rank), µi = 0.8, n = 5, N = 20, 100 MC 15

▇▇▇▇-Equilibrium

R1 + R2 [bits/channel use]

Zero-Forcing ▇▇▇▇-Bargaining Sum-Rate Cooperative-Algo

10 No sharing (TDMA)

-10 -5 0

5 10

SNR [dB]

15 20 25 30

Figure 4.5: Sum rate; stat. CSI (low-rank), µi = 0.8

Inst. CSI, µi = 0, n = 5, N = 20, 100 MC

▇▇▇▇-Equilibrium

R1 + R2 [bits/channel use]

Zero-Forcing ▇▇▇▇-Bargaining

20 Sum-Rate

Cooperative-Algo No sharing (TDMA)

-10 -5 0

5 10 15 20

SNR [dB]

25 30

Figure 4.6: Sum rate; inst. CSI, µi = 0

Inst. CSI, µi = 0.9, n = 5, N = 20, 100 MC

▇▇▇▇-Equilibrium

R1 + R2 [bits/channel use]

Zero-Forcing ▇▇▇▇-Bargaining

Sum-Rate

15 Cooperative-Algo

No sharing (TDMA)

-10 -5 0

5 10 15

SNR [dB]

20 25 30

Figure 4.7: Sum rate; inst. CSI, µi = 0.9

Statistical CSI; SNR = 0 dB, n = 5, µi = 0.5

Region-Boundary

TDMA-Boundary

Algo-Trajectory Algo-Steps Algo-Solution

▇▇▇▇-Equilibrium

Zero-Forcing Sum-Rate

▇▇▇▇-Bargaining

R2 [bits/channel use]

00 1

2 3 4

R1 [bits/channel use]

Figure 4.8: Bargaining trajectory; stat. CSI, SNR = 0 dB

Region-Boundary TDMA-Boundary

Algo-Trajectory Algo-Steps Algo-Solution

▇▇▇▇-Equilibrium

Zero-Forcing Sum-Rate

▇▇▇▇-Bargaining

Statistical CSI; SNR = 10 dB, n = 5, µi = 0.5

R2 [bits/channel use]

00 1

2 3 4 5 6 7

R1 [bits/channel use]

Figure 4.9: Bargaining trajectory; stat. CSI, SNR = 10 dB

Instantaneous CSI; SNR = 0 dB, n = 5, µi = 0.8

Region-Boundary

TDMA-Boundary

Algo-Trajectory Algo-Steps Algo-Solution

▇▇▇▇-Equilibrium

Zero-Forcing Sum-Rate

▇▇▇▇-Bargaining

R2 [bits/channel use]

00 1 2

R1 [bits/channel use]

Figure 4.10: Bargaining trajectory; inst. CSI, SNR = 0 dB

Instantaneous CSI; SNR = 10 dB, n = 5, µi = 0.8

Region-Boundary

TDMA-Boundary

Algo-Trajectory Algo-Steps Algo-Solution

▇▇▇▇-Equilibrium

Zero-Forcing Sum-Rate

▇▇▇▇-Bargaining

R2 [bits/channel use]

00 1

2 3 4 5 6

R1 [bits/channel use]

Figure 4.11: Bargaining trajectory; inst. CSI, SNR = 10 dB

Absolute SAPHYRE Gain, µi = 0, 100 MC

Rate Difference [bits/channel use]

Stat. CSI (full-rank)

10 Stat. CSI (low-rank) Inst. CSI

-2

-4

-10 -5 0

5 10

SNR [dB]

15 20 25 30

Figure 4.12: Absolute SAPHYRE Gain

Fractional SAPHYRE Gain, µi = 0, 100 MC

2.5

Stat. CSI (full-rank)

Stat. CSI (low-rank) Inst. CSI

Gain

1.5

0.5

-10 -5 0 5 10 15 20 25 30

SNR [dB]

Figure 4.13: Fractional SAPHYRE Gain

4.2 Resource Optimization in Multicell OFDMA Networks

In this section, we consider the downlink of multicell orthogonal frequency-division multiple-access (OFDMA) networks and address the adaptive allocation of spec- trum, power, and rate. We assume networks with frequency reuse one and discrete- level rates. We formulate the joint allocation problem as a nonlinear mixed integer program (MIP), which is computationally intractable to solve optimally for practical problem sizes. We exploit the capability of the receivers to measure the perceived interference-plus-noise on every subcarrier and accordingly decompose the joint al- location problem into subproblems that are solved by individual base stations. Each subproblem is a linear MIP and its optimal solution can be obtained by means of standard branch-and-cut solvers, at least for small to medium problem sizes. In order to further reduce the complexity, we propose a distributed iterative algorithm that capitalizes on the subgradient method. We demonstrate the merit of the pro- posed algorithm with numerical comparisons of its performance with the solution of the individual MIPs and the iterative waterfilling algorithm.

4.2.1 System model and problem formulation

{ }

{ − }

{ } { }

We consider downlink transmission in a multicell OFDMA network with a set L ¾ i : i = 1, . . . , L of BSs and a set K ¾ k : k = 1, . . . , K of receivers, where every BS is assumed to serve the same number of receivers, i.e. K/L. The ith BS serves the receivers within the set Ki ¾ (i 1)K/L + 1, . . . , iK/L . The network bandwidth is shared by all BSs and it is divided into a set N ¾ n : n = 1, . . . , N of orthogonal subcarriers. The channel of each subcarrier is flat, since its bandwidth is chosen small enough compared to the coherence bandwidth. The number of bits loaded on each subcarrier is chosen from a finite set Q ¾ {q : q = 1, . . . , Q}.

We define the binary allocation variables xn,q, where xn,q = 1 if subcarrier n is

k k

assigned to receiver k with rate q and xn,q = 0 otherwise. To avoid intracell inter- ▇▇▇▇▇▇▇, each subcarrier can be used by at most one receiver per cell. Hence, for BSi and subcarrier n we have the constraint

Σ Σ x ≤ 1. (4.36)

n,q k

k∈Ki q=1

The sum in the left-hand side of (4.36) equals to zero when BSi does not allocate any receiver to subcarrier n.

i,k

Let Gn denote the gain of the channel between BSi and receiver k on the nth

subcarrier. The signal-to-interference-plus-noise ratio (SINR) of receiver k, served by BSi on subcarrier n with transmit power pn, is

n n

γn ¾ i,k i . (4.37)

k In(pn )

k −i

In (4.37), the interference generated by simultaneous transmissions throughout the network on subcarrier n plus the AWGN noise variance σ2 is denoted

−i

In(pn ) ¾

Σ Gn

j=1,j

pn + σ2, (4.38)

j,k

where pn ¾ [pn, . . . , pn

, pn , . . . , pn] is the vector of all interfering transmit pow-

−i 1

ers.

i−1

i+1 L

Assuming Gaussian signaling, let Tq denote the threshold that the SINR should reach to load q bits, i.e., log2(1 + γ) = q ⇔ γ = 2q − 1 ¾ Tq. When BSi decides to serve receiver k ∈ Ki with q bits on subcarrier n, i.e. xn,q = 1, then, due to (4.37),

in order to have γn = Tq the required transmit power is pn = In(pn )Tq/Gn . Due

k i k −i i,k

to (4.36), this power is given, for an arbitrary subcarrier and bit allocation, by

▇

−i

i,k

pn = Σ Σ xn,qIn(pn )Tq/Gn

k∈Ki q=1

. (4.39)

Moreover, we assume that the total transmit power of every BS cannot exceed the

maximum budget P , i.e.

n=1

pn ≤ P .

The objective is to maximize the achievable sum-rate in the network, i.e., the sum of bit rates of all subcarriers over all cells, subject to the aforementioned constraints. Consequently, the joint resource allocation problem is stated as

Σ Σ Σ Σ qx

L N Q

max

X,P

n,q k

i=1 n=1 k∈Ki q=1

(4.40)

s.t. Σ Σ xn,q ≤ 1 ∀i ∈ L, ∀n ∈ N , (4.41a)

k∈Ki q=1

pn = Σ Σ xn,qIn(pn )Tq/Gn

∀i ∈ L, ∀n ∈ N , (4.41b)

i k

k∈Ki q=1

k −i

i,k

Σ pn ≤ P ∀i ∈ L. (4.41c)

n=1

Problem (4.40)–(4.41) is a MIP with KNQ binary allocation variables X = {xn,q ∈

n∈N, q∈Q

n n∈N

{ }} { ∈ }

0, 1 k∈K and LN continuous power variables P = pi R+ i∈L . This prob- lem is NP-hard in general [37]. The formulation is nonlinear due to the right-hand

side of (4.41b) which involves, due to (4.38), bilinear products of the optimization variables. Finding the optimal solution requires an exhaustive search with worst- case complexity exponential in the total number of variables. The complexity is prohibitive for modern broadband networks which have hundreds of subcarriers. This motivates the low-complexity distributed approach that we are proposing in Section 4.2.3.

4.2.2 Single-cell resource allocation

−i

The most significant challenge in the solution of problem (4.40)–(4.41) is due to the interference-plus-noise terms {In(pn )} in (4.41b) that couple the resource allocation

performed in different cells. However, the fact that each receiver is able to sense

and measure the interference-plus-noise on subcarriers motivates us to decompose the global problem into subproblems solved by individual BSs. In other words,

▇

▇∈Ki

BSi takes as input the values {In}n∈N collecting them from the receivers in its

cell, when the other BSs have already performed the resource allocation. Hence,

the coupling among the resource allocation problems in different cells is eliminated. Consequently, the joint problem (4.40)–(4.41) decouples into L sub-problems, each solved separately by a different BS. The problem corresponding to BSi is

Σ Σ Σ qx

N Q

max

Xi,Pi

n,q k

n=1 k∈Ki q=1

(4.42)

Σ Σs.t. x ≤ 1 ∀n ∈ N , (4.43a)

n,q k

k∈Ki q=1

pn = Σ Σ xn,qInTq/Gn

∀n ∈ N , (4.43b)

i k k

k∈Ki q=1

i,k

Σ pn ≤ P. (4.43c)

n=1

Problem (4.42)–(4.43) is a MIP with KNQ/L binary variables Xi = {xn,q ∈

n∈N, q∈Q

n n∈N

{0, 1}}k∈Ki and N continuous variables Pi = {pi ∈ R+} . Not only this

problem has L times smaller dimension than the joint one, but also it is linear,

{ }

since the constraints (4.43b) have now, for given In , become linear. There ex- ist several solvers, implementing branch-and-cut techniques, that find the optimal solution of linear MIP problems frequently avoiding exhaustive search. However, the worst-case complexity of these techniques still increases exponentially with the number of variables and becomes impractical for large problem sizes, motivating us to investigate low-complexity solutions to (4.42)–(4.43). Due to the binary ▇▇▇▇- ▇▇▇▇▇, this problem is non-convex. Existing solutions to this problem of single-cell resource allocation typically exploits the relaxation of binary variables so that the problem can be solved using convex linear programming [38]. The disadvantage is that rounding off the variables into binary ones takes the solution far from the opti- mal solution. Herein, we take advantage of the seminal contribution on multicarrier systems in [39], which has shown that, using dual optimization, the duality gap decreases as the number of subcarriers increases. The large number of subcarriers in practical OFDMA networks therefore motivates us to solve (4.42)–(4.43) in the dual domain.

▇

▇∈Ki

In the following, we focus on the resource allocation problem in the ith cell, assuming that the allocation has been already performed in the other cells, i.e. for some given {In}n∈N . Inspecting (4.43b), we observe that the transmit powers Pi depend

entirely on the variables ▇▇, provided that they also meet the bound (4.43c). Hence,

substituting (4.43b) into (4.43c), we can rewrite problem (4.42)–(4.43), with respect to only the optimization variables Xi, as

Σ Σ Σ qx

N Q

max

n,q k

n=1 k∈Ki q=1

(4.44)

s.t. Σ Σ xn,q ≤ 1 ∀n ∈ N , (4.45a)

k∈Ki q=1

N Q

▇

i,k

Σ Σ Σ xn,qInTq/Gn

n=1 k∈Ki q=1

≤ P. (4.45b)

This enables us to solve the MIP (4.42)–(4.43) in two steps. First, we solve the linear binary problem (4.44)–(4.45) to determine the subcarrier and bit level allocation, and then plug the solution into (4.43b) to compute the transmit powers.

The solution to (4.44)–(4.45) would be straightforward if we decouple the power budget constraint in (4.45b) and perform the optimization per subcarrier. This motivates the incorporation of (4.45b) into the objective function and form a La- grangian function as

N Q

Li(Xi, λi) = Σ Σ Σ qxn,q

(4.46)

.ΣN

− λi

n=1 k∈Ki q=1

n,q

Σ Σ Σ

xk Ik Tq/Gi,k − P

n=1 k∈Ki q=1

and the corresponding dual function as

{ }

Di(λi) = sup Li(Xi, λi) : (4.45a) , (4.47)

where λi is the Lagrange multiplier. This multiplier is obtained in the dual domain for a given Xi by solving the corresponding dual problem

min

λi≥0

Di(λi). (4.48)

This problem can be solved by the subgradient method, i.e., beginning with an initial λi(0), given λi(t) at iteration t, we obtain Pi from Xi using (4.43b). We then update the Lagrange multiplier as

λi(t + 1) =

Σλi(t) − α

.P −

Σn=1

ΣΣ+

, (4.49)

where P −

n=1

pn is the subgradient of Di(λi) with respect to λi and α is a step size

that should be small enough to ensure the convergence [40]. The aforementioned

approach therefore enables individual BSs to contribute the solution of the original problem separately.

To evaluate Di(λi) for a given λi in (4.47), BSi substitutes ▇▇(Xi, λi) in (4.47) with (4.46) and forms an optimization problem represented by

N Q

max

Σ Σ Σ xn,qfn,q

(4.50)

Xi k k

n=1 k∈Ki q=1

s.t. Σ Σ xn,q ≤ 1 ∀n ∈ N , (4.51)

k∈Ki q=1

▇

i,k

where fn,q ¾ q − λiInTq/Gn . Due to (4.51), each subcarrier can be used by at most

one receiver, with a single bit rate, within cell i. This statement along with the

decomposable form of (4.50)–(4.51) enables separate allocation for each individual subcarrier. The solution is therefore obtained by assigning each subcarrier n to receiver kn ∈ Ki with bit rate qn as

(kn, qn) = arg max

(k,q):k∈Ki,q∈Q

n,q k

(4.52)

provided that fn,qn

> 0. In other words, for each subcarrier in cell i, we go over

the QK/L possible receiver-bit assignments and select the one giving the largest

positive value. Hence, xn,q

= 1 if k = kn and q = qn, otherwise xn,q

= 0. Due

to (4.43b), the transmit power is pn = InTq/Gn in the former case and pn = 0 in

i k i,k i

the latter case. However when fn,qn ≤ 0, then xn,q = 0 for all k ∈ Ki, q ∈ Q, and

kn k

accordingly pn = 0.

4.2.3 Distributed resource allocation algorithm

Given the solution for the allocation problem of each BS, presented in Section 4.2.2, in the sequel we propose a distributed subcarrier, power, and bit level (DSPB) allocation algorithm for the downlink of multicell OFDMA networks. The DSPB algorithm is based on the iterative update of the Lagrange multiplier in (4.49). We assume that there is a network coordinator, which synchronizes the BSs so that they know their order in the algorithm. This coordinator, also, terminates the algorithm upon satisfaction of the convergence condition. During the algorithm iterations, the channel gains are assumed to be constant. In addition, there is a mechanism to feedback, for all subcarriers, the channel gains and perceived interference from all receivers in each cell to the corresponding BS.

At first, every BS initializes the Lagrange multiplier and distributes uniformly a part of the power budget on all subcarriers (step 1, where δ < 1). The network co- ordinator continues the iterations till the aggregate differential power in the network

Algorithm 3 Distributed Subcarrier, Power, and Bit level allocation (DSPB)

1: Initialization: t = 0, λi(0) = λinit ∀i ∈ L, pn = δP/N ∀i ∈ L, ∀n ∈ N

2: while

Σ| Σ

pn − P | ≥ ǫ do

i∈L n∈N

3: t = t + 1

k∈Kj

4: Network coordinator chooses BSj in a round-▇▇▇▇▇ order.

−j

5: BSj measures {In(pn

)}n∈N .

6: BSj determines Xj and Pj using (4.52) and (4.43b) respectively.

7: BSj updates λj(t) using (4.49).

8: end while

would be less than an accuracy threshold ǫ (step 2). This condition characterizes the satisfaction of the power constraints (4.43c). At each iteration, a BS is chosen in a round-▇▇▇▇▇ manner to update its subcarrier, power, and bit level allocation subject to the measured interference from the other BSs (steps 4, 5, and 6). Using the new power settings, the chosen BS updates its Lagrange multiplier (step 7).

The DSPB algorithm takes advantage of two decomposition levels to overcome the exponential complexity of exhaustive search methods over the NQK binary ▇▇▇▇- ▇▇▇▇▇. First, decoupling the original resource allocation problem (4.40)–(4.41) into subproblems, we decrease the exponential complexity to be linear in L. The lin- earity is due to the Lagrange multiplier update in (4.49). Second, the complex- ity O((QK/L)N ) of subcarrier and rate allocation within each cell is decreased to O(NQK/L) by dual decomposition in Section 4.2.2, as we came up with an opti- mization per subcarrier. In overall, the complexity is O(NQK), linear in the number of subcarriers, bit levels, and users. On the other hand, the algorithm burdens some signalling overhead. The network coordinator notifies the BSs of their order in the algorithm and finally terminates the algorithm. At the end of every iteration, the chosen BS has to send to the coordinator its updated aggregate transmit power.

4.2.4 Performance evaluation

We consider downlink transmission in a network with four cells of radius R = 1 Km and 8 users. Every BS, located at the center of the corresponding cell, serves 2 users, randomly placed within the cell. The path loss (in dB) at a distance d from a BS is given by L(d) = L(d0) + 10α log10(d/d0), where for the reference point it is d0 = 50 m, L(50) = 0, and the path loss exponent is α = 3.5. The shadowing effect is modeled as an independent log-normal random variable with 8 dB standard deviation. The channel on each link is assumed to be ▇▇▇▇▇▇▇▇ ▇▇▇▇▇▇, modeled by a six-tap impulse response with exponential power delay profile indicated by ge−(l−1), where g = 1 is the first path’s average power gain and l is the path index. Moreover, the root-mean-square delay spread is 0.9 µs. The transmission budget of each BS is P = 5 W and the noise variance is assumed to be σ2 = −90 dBm for all receivers.

The bit level on each subcarrier is chosen from the set Q = {1, 2, ..., 5}, so that the corresponding SINR thresholds are Tq = {1, 3, 7, 15, 31}, respectively.

Firstly, to investigate the performance of DSPB for a typical number of subcarriers,

e.g. N = 64, we show in Figure 4.14 the sum-rate achievement (in bits per OFDM symbol) of each cell versus the iteration number. It is seen that with the convergence of the transmit powers (ǫ = 0.1), the cell sum rates attain their final values.

In the following, we compare, in the aforementioned setup, the performance of DSPB with the result obtained by solving the individual MIP (IMIP) (4.42)–(4.43) at individual BSs. The optimal solution in the primal domain of each IMIP is obtained calling the GNU linear programming kit (GLPK)[41]. In this scheme, similar to DSPB, beginning with uniform power allocation, the individual problems at BSs are solved optimally in a round-▇▇▇▇▇ manner. As a lower bound, we also include the sum-rate values achieved from the iterative waterfilling algorithm (IWF) [42, 43] customized to OFDMA systems using joint subcarrier and power allocation as in [44] and [45]. Since the subcarrier rates in IWF are assumed to be continuous, we round off each achievable rate to the largest integer value not greater than that rate. We compare the overall sum-rate of the aforementioned schemes with different number of subcarriers, i.e. N . For each value of N , we obtain the sum rates for 50 realizations of the fading channel gains and show the average sum rates in Figure 4.16. We observe that DSPB outperforms both IMIP and IWF schemes. The performance gap between DSPB and IWF becomes larger as the number of subcarriers increases. This is due to the degradation effect of the rounding operation in IWF which increases with the number of subcarriers.

The performance difference between IMIP and DSPB is due to the fact that, in IMIP, each BS adopts the optimal solution in (4.42)–(4.43) to maximize its own sum rate. This optimal strategy most likely generates a large interference and therefore degrades the performance of other BSs significantly. However, in DSPB,

▇

each BS assigns each subcarrier as in (4.52), where fn,q can be written as fn,q =

−

(q λipn). In other words, in addition to the achieved rate q, DSPB also takes the required transmit power pn into account in subcarrier allocation via the Lagrange multiplier acting as power price. Apparently, DSPB tends to minimize the generated interference on the other cells and therefore they undergo small rate degradation at the last iteration.

cell

160

DSPB sum−rate (bit per symbol)

140

120

100

5 10 15 20 25 30

Iteration (t)

Figure 4.14: Sum-rate variation in DSPB

cell

Transmission power (W)

5 10 15 20 25 30

Iteration (t)

Figure 4.15: Power variation in DSPB

700

600

Sum−rate (bit per symbol)

500

400

300

200

100

DSPB IMIP IWF

20 40 60 80 100 120

# subcarriers (N)

Figure 4.16: Average sum-rate in DSPB, IMIP, and IWF

5 Relay Assisted Infrastructure Sharing and Spectrum Sharing

5.1 Interference Neutralization

In this Section, we focus on a multi-user wireless network equipped with multiple relay nodes where some relays are more intelligent than other relay nodes. The intelligent relays are able to gather channel state information, perform linear pro- cessing and forward signals whereas the dumb relays is only able to serve as am- plifiers. As the dumb relays are oblivious to the source and destination nodes, the wireless network can be modeled as a relay network with smart instantaneous relay only: the signals of the source-destination link arrive at the same time as the source-relay-destination link. Recently, instantaneous relaying is shown to im- prove the degrees-of-freedom of the network as compared to the classical cut-set bound. In the following, we study an achievable rate region and its boundary of the instantaneous interference relay channel in the scenario of (a) uninformed non- cooperative source-destination nodes (source and destination nodes are not aware of the existence of the relay and are non-cooperative) and (b) informed and coopera- tive source-destination nodes. Further, we examine the performance of interference neutralization: a relay strategy which is able to cancel interference signals at each destination node in the air. We observe that interference neutralization, although promised to achieve desired degrees-of-freedom, may not be feasible if the relay has limited power. Simulation results show that the optimal relay strategies improve the achievable rate region and provide better user-fairness in both uninformed non- cooperative and informed cooperative scenarios.

∈

In a SISO interference relay channel (IRC), an example of two sources and two destinations shown in Figure 5.1, we denote the sources as Si and destinations as Di, i = 1, . . . , K. The multi-antenna relay node is denoted as R. Denote the complex channel from Si to Dj as hji and the complex channel vector from Si to R as gri and from R to Dj as gjr. All channels are assumed to be independent identically distributed complex Gaussian variates, gri, gjr CM×1, where M is the number of antennas at the relay. We assume a memoryless instantaneous relay [46, 47, 48] which has a linear processing matrix R CM×M . The signal received at R is:

yr = grjxj + nr (5.1)

j=1

where xj are the transmit symbols from Sj which is assumed to be zero mean

proper Gaussian variable and has power constraint Pmax, E [|xj|2] = Pj ≤ Pmax, j =

1, . . . , K. The noise at the relay is denoted as nr which is assumed to be independent

identically distributed proper Gaussian variables with zero mean and unit variance. The assumption of circularity for the transmit symbols simplifies the derivation as the achievable rate is the ▇▇▇▇▇▇▇ rate. For the degrees-of-freedom and achiev- able rates improvement with improper Gaussian signaling, please refer to [49, 50] respectively. The received signal at Dj, j = 1, . . . , K, is

yj = Σ .hjl + gHRgrlΣ xl + gHRnr + nj. (5.2)

l=1

For brevity, denote p = [P1, . . . , PK]T ∈ RK×1. The Signal-to-Interference-and-

Noise ratio at destination j is

H 2

SINR (R, p) = |hjj + gjrRgrj | Pj

(5.3)

j K

l=1,l

j |hjl

+ gHRgrl

|2Pl

+ ǁgHRǁ2 + 1

where ǁgHRǁ2 is the amplified noise from R to Dj. The power constraint at the

relay is

Ey Σtr .RyryHRHΣΣ ≤ Pmax. (5.4)

j=1

Note that En ,x ,j=1,...,K ΣyryH Σ = ΣK

grjgHPj + I. The power constraint is there-

fore rewritten as the following:

tr .RQRHΣ ≤ Pmax

(5.5)

j=1

where Q = ΣK grjgHPj + I.

Interference neutralization

−

In order to neutralize interference, the following K(K 1) equations have to be satisfied at the same time:

hij + gHRgrj = 0, i, j = 1, . . . , K, i ƒ= j. (5.6) Let Gdr = [g1r, g2r, . . . , gKr] and Grt = [gr1, gr2, . . . , grK]. Denote sl = gHRgrl.

We have the interference neutralization requirement,

GH RGrt = S (5.7)

with

 s1 . . . −h1K 

S =



−h21 s2 . . .





 . .

. (5.8)

. . . −hK(K−1) sK

gr1

g1r

h11

S1 repeater D1

grK

hK1

h1K

gKr

repeater

gr1

g1r

h11

grK

h1K

gKr

hK1

hKK

(a) a relay-assisted net- (b) instantaneous relay net-

work work

Figure 5.1: The wireless relay-assisted network with layer one repeaters and one smart relay is shown in subfigure (a). The dotted lines demonstrate the equivalent links between source the destination taking into account of the presence of the repeaters. All paths from source to destination nodes take two time slots and links from source to relay and relay to destination take one time slot. The equivalent channel is established in subbfigure (b) by replacing the relay as an instantaneous relay and information through instantaneous relay arrive at destinations the same time as the direct links.

Note that S is a matrix with off-diagonal elements as the channel coefficients of the interference channel and diagonal elements as the optimization variables s. As we sill show later, the optimization can be facilitated if the optimization variable is S instead of s. This is due to the linear relationship between R and S. However, we must stress that only the diagonal elements of S, s, are free to be optimized as S is constrained to the form in (5.8). Eqt. (5.8) can be rewritten to a more comprehensive form. We introduce a row selection matrix T which select the off- diagonal elements of S from the vector vec(S). For example when K = 2, we have T = [0, 1, 0, 0; 0, 0, 1, 0] and T vec(S) = [−h21, −h12]T . We have

T vec(S) = −T vec(H). (5.9)

If IN is feasible (see later in Theorem Theorem 5.1 for feasibility issue), we can choose a relay matrix R, which is a function of complex coefficients s, that satisfies the IN constraint (5.7) and achieves the following SINR,

H 2

j j

SINRIN(S, P ) = |hjj + gjrR(S)grj | Pj . (5.10)

ǁgHR(S)ǁ2 + 1

In the following, we use the matrices R and R(S) interchangeably when we wish to emphasize the optimization parameter S. Please see the next lemma for the formulation of R(S). In order to make sure the requirements in (5.7) are not over-

determined, we find the minimum number of antennas required in the relay such that (5.7) is feasible.

≥ ∈

Lemma 5.1. A sufficient condition of IN in (5.7) on the minimum number of an- tennas at the relay is M K. For some target signal coefficients s1, . . . , sK C, the relay processing matrix R that satisfies the interference neutralization requirement (5.7) is determined by

vec(R) = .GT ⊗ GH Σ† vec(S) (5.11)

where GT ⊗ Gdr is assumed to have full row rank1, rank .GT ⊗ GH Σ = K2.

Note that when M < K, the system in (5.7) has more equations than unknowns and the consistency of the system depends heavily on the particular channel realizations. For M ≥ K, we write vec(R) as

vec(R) = .GT ⊗ GH Σ† vec(S) + b (5.12)

where b ∈ CM2 is in the null space of GT ⊗ G . With (5.12), one must optimize

both b and ▇▇▇(S) in order to optimize vec(R). For simplicity of presentation, from

now on, we set M = K and obtain [51, P.35]

vec(R) = .GT ⊗ GH Σ−1 vec(S). (5.13)

If there is no power constraint at the relay, given any target signal coefficients si, we can construct a relay matrix R as in (5.13) such that the IN requirement is satisfied. Otherwise, any performance metrics, e.g. sum rate, subject to the IN and power constraints, are optimized in a dimension of K by optimizing the complex coefficients s = [s1, . . . , sK]T .

5.1.1 The Pareto boundary optimization problem

The achievable rate region of an instantaneous AF relay is defined to be the set of rate tuple achieved by all possible relay processing matrix R satisfying the power constraint (5.5):

[

R =

R:tr(RQRH )≤Pr

(C(SINR1(R)), . . . , C(▇▇▇▇▇(R))) (5.14)

where C(x) = log2(1 + x). Similarly, we define the achievable rate region of an instantaneous AF relay with IN to be the set of rate tuple achieved by all pos- sible relay processing matrix R satisfying the IN constraint (5.13) and the power

1The assumption of full column rank is not difficult to fulfill as the elements in the channel matrix are statistically independent.

[

constraint (5.5):

RIN =

R:tr(RQRH )≤Pr,

(C(SINR1(R)), . . . , C(▇▇▇▇▇(R))) . (5.15)

vec(R)=(GT ⊗GH

−1

vec(S),

rt dr )

T vec(S)=−T vec(H)

⊂

Note that the IN requirement gives a smaller feasible set and thus RIN R. How- ever, the motivation of the study of ▇▇▇ is two-fold: (a) intuitively when both transmit power at S and R is very large, the optimal relay strategy is to neutralize interference, so as to transmit at the maximum DOF; but the performance of other SNR regimes is yet to be studied. (b) the characterization of RIN in the instanta- neous IRC with direct link between S-D pairs is still an open problem. The outer boundary of R (RIN) – the Pareto boundary (PB) of R (RIN) – is a set of operat- ing points at which one user cannot increase its own rate without simultaneously decreasing other users rate.

≥ ƒ

Definition 5.1 ([52, 53]). A rate-tuple (r1, . . . , rK) is Pareto optimal if there is no other rate tuple (q1, . . . , qK) such that (q1, . . . , qK) (r1, . . . , rK) and (q1, . . . , qK) =(r1, . . . , rK)2.

By definition, the operation points on the PB can be evaluated by maximizing one user’s rate while keeping other users’ rates constant. Other optimization techniques for PB evaluation has been proposed [54, 55]. Here, the PB problem is formulated as the maximization of SINR1 subject to the constraints on SINRj ≥ γj for some pre-determined target SINR values γj, j = 2, . . . , K.

Problem Statement 1. The Pareto boundary of R (5.14) is a set of rate tuple

C(γ ), C(γ ), . . . , C(γ ) where γ is the optimal objective value and γ , j =

2, . . . K are the constraints of the following optimization problem.

(PB)

 R∈CM ,p∈R+

max

K×1

SINR1(R, p)

(5.16)



s.t.

tr RQR ≤ P .

SINRj(R, p) ≥ γj, j = 2, . . . , K,

max

. Σ

Similarly, we formulate the PB optimization problem with IN in the following. To this end, we combine the first two constraints in the feasibility set of (5.15). Using the fact that tr (A B CD) = vec(DT )T CT ⊗ A vec(B) [51], we have

tr(RQRH) = vec(R)H .QT ⊗ IΣ vec(R)

= vec(S)

vec(S)

s ˛¸ x

diag(p) + Grt Grt

⊗ Gdr Gdr

(a)

H ..

−∗ −T Σ

−1 −H Σ

(5.17)

Q˜

2The inequality is component-wise.

where (a) is due to (5.13) and G−∗QT G−T = diag(p) + G−∗G−T .

rt rt rt rt

Problem Statement 2. The Pareto boundary of RIN (5.15) is a set of rate tuple

.C(γ#), C(γ2), . . . , C(γK)Σ where γ# is the optimal objective value and γj, j =

2, . . . K are the constraints of the following optimization problem.

(PB IN) : max

S∈CK,p∈R+

▇▇▇▇▇▇(S, P1) (5.18a)

− K×1

s.t. ▇▇▇▇▇▇(S, Pj) ≥ γj, j = 2, . . . , K, (5.18b) vec(S)HQ˜ vec(S) ≤ Pmax, (5.18c)

T vec(S) = −T vec(H). (5.18d)

(5.18)

Before we show the optimization methods of the aforementioned problems, the feasibility issue of (5.18a) needs to be addressed.

Theorem 5.1. A necessary and sufficient condition for the feasibility of interference neutralization - satisfying (5.18c) and (5.18d) simultaneously - is

(T vec(H))H .T Q˜−1T HΣ−1 T vec(H) ≤ Pmax, (5.19)

where Q˜ = .diag(p) + G−∗G−T Σ ⊗ .G−1G−HΣ. A feasible solution is

vec(S) = F .xn + (TF )†T vec(H)Σ (5.20)

where xn ∈ N(TF ) and Q˜−1 = FF H.

Note that the left hand side of the condition (5.19) is the power of the relay ma- trix which only performs IN (with xn = 0 in (5.20)) and only depends on channel coefficients whereas the right hand side is the relay power constraint. The impor- tance of ▇▇▇▇▇▇▇ ▇▇▇▇▇▇▇ 5.1 lies in the practicality of IN feasibility verification. With the information of channel qualities H, Grt, Gdr, transmit power at S p and transmit power constraint at relay Pmax, we can immediately check whether IN is feasible or not. Further, if IN is feasible, there is always one feasible solution in

(5.20) by setting xn = 0K×1; if there is excess power at relay, we can optimize xn, and in turn vec(S), to improve system performance, which is the objective of the

succeeding section.

5.1.2 The optimal relay strategies with uninformed S-D nodes

In this section, we analyze the PB problems taking into consideration that the S and D nodes are uninformed of the presence of the relay nodes and therefore do not optimize their transmit power values: p = p0 for some pre-determined power values p0. If the S-D pairs are non-cooperative, then each S transmits at full power and

thus p = 1 Pmax. Given the transmit power values of the source nodes, we maximize the achievable rate by choosing the relax matrix. The PB problems are formulated into quadratically constrained quadratic programs (QCQP) and are then relaxed to semi-definite programs (SDP). The relaxed problems are convex and can be solved using efficient convex optimization tools such as CVX [56]. We show that in some scenarios, the optimal solution of the relaxed problems attains the optimality of the original problems. In other words, the convex optimization methods solve the original problem efficiently in such scenarios (Please refer to Corollary Remark 5.1 and Lemma Lemma 5.4 for details). The procedures used to obtain the PB are summarized in Algorithm 4.

Algorithm 4 The pseudo-code for the Pareto boundary optimization in (5.16) and (5.18a).

1: for j = 2 → K do

2: Compute the single-user-point of user j: γmax

| hjj +gH Rgrj |2Pj

ǁgH Rǁ2+1

= maxR , in

which the user j is the only user in the system with no interference.

4: end for

N −1

▇▇▇ ▇

For a predefined integer N , let V = ,0, γmax , 2γmax , . . . , γ ,.

5: Define a tuple s to be a vector of possible values of γj, s = [γ2, . . . , γK] and

∈ × × · · · ×

s S = V2 V3 VK.

∈

6: for each s S do

7: With input parameter s, matrix S is defined as in (5.8). Solve the optimiza- tion problems (5.22) for general relay processing matrix optimization or (5.24) for relay processing matrix optimization with interference neutralization.

8: if the optimization problem is feasible then

−

9: the optimal value γ† and s form a point on the Pareto boundary, which is a K 1 dimension hyper-surface, in the K dimensional space.

10: else

11: the values of s are unachievable subject to the constraints.

12: end if

13: end for

The Pareto boundary: general relay optimization

By introducing an auxiliary variable, we formulate the PB optimization problem with general relay processing matrix as a QCQP. The optimization variable is of dimension M 2 + 1 as compared to M 2 number of elements in the relay matrix. Nevertheless, this provides a more structural formulation and amends the analysis as shown in the sequel.

Lemma 5.2. The Pareto boundary of an IRC with instantaneous AF relay (5.16)

is a rate tuple (C(γ#), C(γ ), . . . , C(γ

)) in which γ# =

v¯H X11v¯

and v¯

is the op-

1 2 K

1 v¯H X12v¯

timal solution of the following optimization problem. The values γj, j = 2, . . . , K

contribute to the constraints of the optimization problem.



max

v∈C(M 2+1)×1

vHX11v vH X12v

vHXj1v

(5.21)

s.t.



vH Xj2

v ≥ γj, j = 2, . . . , K,

vHX3v ≤ 0.

The matrices, for i = 1, . . . , K, are given by

Xi1 =

Pi,

Σ (gri ⊗ g∗ )∗ (gri ⊗ g∗ )T (gri ⊗ g∗ )T hii Σ

h∗ (gri ⊗ g∗ ) |hii|

ii ir

Σ Σ

(grl ⊗ gir)

∗

Σ Σ ∗ ∗ ∗ T

hil (grl ⊗ gir)

(grl ⊗ gir)

hil

IM ⊗ (girgir ) 0M 2×1

∗ T Σ Σ H Σ

|hil|

01×M 2 1

X3 =

QT ⊗ I 0M2×1

01×M2 −Pr

Define V = vvH, we obtain the following convex problem after removing the con- straint rank(V ) = 1:



V ∈C(M 2+1),V ≤0



 max tr (X11V )

s.t. tr (X12V ) = 1,



tr ((Xj1 − γjXj2) V ) ≥ 0, j = 2, . . . , K,

tr (X3V ) ≤ 0

(5.22)

which is a semi-definite program (SDP) as matrices Xi1, Xi2, X3 are Hermitian and can be solved efficiently using SDP solvers.

Remark 5.1.√By [57, Theorem 3.2] the rank of the optimum solution of (5.22) is

smaller than K + 1. In the scenario of K = 2 S-D pairs, the rank of the optimal

solution is one which means that the relaxation is tight and the obtained solution is the global optimal solution of (5.21). In occasions when the optimal solution returned by CVX is not rank-one, one can find a vector which satisfies the same constraint and objective values in (5.22) using the rank-one reduction procedures [58, Theorem 2.3]. Such vector is thus one of the global optimal solutions of (5.21). For more S-D pairs, the result of SDP in (5.22) is not rank-one. However, one can apply randomization approximation techniques [59] to approximate the optimal solution of (5.21).

The Pareto boundary with IN

We manipulate the problem (5.18a) in the same fashion as in (5.21). The opti- mization problem (5.18a) has one more constraint (the IN constraint) and is thus optimized in a smaller dimension K2 + 1 rather than M 2 + 1. Note that the con- dition given in Theorem Theorem 5.1 is a necessary and sufficient condition for the feasibility of IN (non-empty feasible set in (5.15)). In other words, if the condition is not satisfied, then the optimization problem in (5.23) is not feasible regardless of the target SINR values γ2, . . . , γK.

Lemma 5.3. The Pareto boundary of an IRC with instantaneous AF relay and IN

(5.18a) is a rate tuple (C(γ#), C(γ2), . . . , C(γK)) in which γ# = y¯HBˆ i1y¯ and y¯ is the

1 1

optimal solution of the following optimization problem. The values γj, j = 2, . . . , K

contribute to the constraints of the optimization problem.



y∈C(K2+1)×1



 max yH Bˆ11 y

where

s.t. yH Bˆ12 y = 1,



yH Bˆ j y ≥ 0, j = 2, . . . , K,

 yH Dˆ3 y ≤ 0, yH Dˆ4 y = 0.

(5.23)

B11 =

ˆ Σ LT e1eT L LT e1h11 Σ

∗

ˆ Σ G−∗G−T ⊗ e1eT

0K2×1 Σ

h11e1 L |h11|

01×K2 1

P1,

B12 =

Bj =

∗

, D3 =

ˆ Σ LT ejeT L − γj .G−∗G−T ⊗ ejeT Σ LT ej hjj Σ ˆ

Σ Q˜ 0K2×1 Σ

max

hjj ej L |hjj| − γj

01×K2 −Pr

Dˆ4 =

T HT T HT vec(H)

vec(H)HT HT vec(H)T HT vec(H) .

Let Y = y yH. Using SDR techniques, the problem in (5.23) can be relaxed to the following problem by dropping the rank one constraint on Y . Problem (5.24) is a convex problem, in particular a semi-definite program, which can be solved efficiently.



Y ∈CK2+1,Y ≤0

 max tr .Bˆ11Y Σ



s.t. tr .Bˆ12Y Σ = 1,

tr .Bˆ j Y Σ ≥ 0, j = 2, . . . , K,

 tr .Dˆ3Y Σ ≤ 0, tr .Dˆ4Y Σ = 0.

(5.24)

Note that, if the optimal solution in (5.24) is rank one, then such solution solves (5.23) optimally. If the optimal solution in (5.24) is not rank one, then the optimality

of (5.23) can no longer be guaranteed. In the following, we characterize the rank of the optimal solution of (5.24).

Lemma 5.4. The optimal solution of K-user IRC Pareto boundary problem with

IN in (5.24), Y˜ , satisfies

rank(Y˜ ) ≤

√

K + 1. (5.25)

The optimization methods and results for (5.24) are similar to Corollary Remark 5.1 and shall not be repeated here. In the following section, we provide numerical evi- dence of performance gain of a relay introduction to a SISO interference channel. In particular, in a setting of uninformed source nodes, we show the rate improvement of solely introducing and optimizing the relay strategy whereas in a setting of informed S-D nodes, we compare the rate performance of the general relay optimization and the relay optimization with IN to the rate region of a SISO IC.

5.1.3 Simulation results

For illustrative purposes, we let K = M = 2. We assume that each element in the channel matrices H, Grt, Gdr is an independent identically distributed complex Gaussian variable with zero mean and unit variance. In Section 5.1.3, we simulate the achievable rates of the SISO IC (marked as squares). For comparison, we sim- ulate the achievable rates of the fully connected IRC with the same S-D nodes, by introducing a relay, equipped with 2 antennas, to the aforementioned IC, and the relay can choose to enforce IN (marked with asterisks) or not (marked with trian- gles). We show that optimized relay strategies improve achievable rate regions. In Section 5.1.3 and 5.1.3, we compare the average sum rate and proportional fairness utility achieved by optimized relay strategies and by power allocation on the IC respectively. In Section 5.1.3, we illustrate the sum rate performance and propor- tional fairness utility when the transmit power constraint at source nodes and relay nodes vary.

Rate region improvement

In Figure 5.2, we plot the achievable rate region of a two-user SISO IC with transmit power constraint at each source node Pmax = 10 dB. Introducing an instantaneous relay, equipped with 2 antennas, we obtain an IRC. We set the relay power con- straint as Pmax = 20 dB. The achievable rates achieved by general relay strategies and IN outperform the IC case. The black arrows originate from the ▇▇▇▇ Equi- librium point: the rate points in which both users transmit with full power. The north-east side of the arrows mark the rate region improved by the relay, in the scenario of uninformed source nodes. This validates our intuition that optimized relay strategies can improve achievable rates of the system even if the source nodes are oblivious to the existence of the relay and do not change their transmit power.

Further note that the single user points achieved in IRC with general relay ▇▇▇▇▇▇- ▇▇▇▇ always outperform the single user points in a SISO IC. It demonstrates that the relay not only is capable of reducing interference in the system but also forwarding the desired signal to the destinations.

IRC-IN

SISO-IC

IRC general

SAPHYRE gain

R2 [bits/sec/Hz]

00 1 2 3 4 5 6

R1 [bits/sec/Hz]

Figure 5.2: The rate improvement of relay optimization on a two-user SISO IRC

with K = M = 2, Pmax

= 20 dB, Pmax

= 10 dB. The arrow marks

the increment of rate region by introducing a relay into the system and optimizing the relay strategy over the ▇▇▇▇ equilibrium point.

Average sum rate improvement

In Figure 5.3, we show the maximum sum rate achieved by general relay strategies, IN and power allocation on the IC, averaged over 100 independent channel realiza-

tions. The power constraint at the source node is assumed to be Pmax

= 10 dB

and we increase the relay transmit power from 5 dB to 25 dB. We observe that the optimized relay strategy without IN always outperform the maximum sum rate of the IC, demonstrating that an instantaneous smart relay can improve average sum rate performance. Further, we observe that the performance of IN is limited by the relay transmit power. Although IN is analytically appealing, there are limitations of the implementation of IN. Such scenarios include strong interference channels in which the receivers have strong interference from other transmitters in the system. In this case, more power at the relay may be required to completely null out inter- ▇▇▇▇▇▇▇ and if such power is not available to the relay, then IN is not feasible. On the other hand, if the strength of the interference channel is not strong, enforcing IN, the relay loses its optimization degrees of freedom and may not be able to achieve some operating points as the general relay strategies would achieve.

SAPHYRE gain

SISO-IC

IRC-general IRC-IN

sum rate [bits/sec/Hz]

05 10 15 20 25

Pmax [dB]

Figure 5.3: The average sum rate a two-user SISO IRC with K = M = 2, Pmax = 10 dB. The optimized relay strategies improve the average sum rate of the system significantly.

Proportional fairness improvement

While average sum rate is an important system performance measure, user fairness holds importance in many applications. In a 2-user system, the proportional fairness utility is defined as the area of a rectangle with one vertex as the operating point (R1, R2) and the opposite vertex as the threat point (or the ▇▇▇▇ Equilibrium point) (RNE, RNE),

1 2

.R1 − RNEΣ .R2 − RNEΣ . (5.26)

The idea is to choose an operating point on the Pareto boundary which provides good improvement for both users (or loosely speaking as far away from the threat point as possible). The proportional fairness utility achieved by a scheme with Pareto boundary P is defined as

rpf = max

.R1 − RNEΣ .R2 − RNEΣ (5.27)

(R1,R2)∈P,

R1≥RNE,R2≥RNE

where P denotes the Pareto boundary. In Figure 5.4, we illustrate the average proportional fairness utility. We observe that the optimized general relay strategies and optimized IN strategies, provide promising proportional fairness and better sum rate performance compared to IC.

SAPHYRE gain

SISO-IC

IRC-general IRC-IN

proportional fairness utility

05 10 15 20 25

Pmax [dB]

Figure 5.4: The average proportional fairness utility (R1 − RNE)(R2 − RNE) of a

two-user SISO IRC with K = M = 2, Pmax = 10 dB. The optimized

relay strategies improve fairness of the system significantly.

Performance measures in terms of transmit power constraints

It is interesting to observe that the performance of optimized relay strategies depend on both Pmax and Pmax. This is due to the amplify-and-forward nature of the relay.

s r

If the transmit power from source nodes is high, the relay can spend less power on

amplification of signals due to the relay power constraint.

In Figure 5.5, we plot the maximum sum rate achieved by optimized relay strategy minus the maximum sum rate of the SISO-IC with general relay strategy in Figure 5.5(a) and with IN in Figure 5.5(b). Note that the feasibility conditions of IN, shown in Theorem Theorem 5.1, is validated in Figure 5.5. For a fixed relay power Pmax, when the source power increases such that the conditions are violated, IN is not feasible and the achievable rate is zero and consequently the corresponding values in Figure 5.5(b) are negative. For the general relay optimization, for a fixed relay power and increasing source power, the rate performance is not always better because the interference power increases with the source power and the relay may not have enough power to manage interference and amplify desired signals simultaneously.

In Figure 5.6, we show the maximum proportional fairness utility achieved by opti- mized relay strategies and IC. When both source power and relay power are abun- ▇▇▇▇, the fairness is desirable. However, when the source power (which is also the strength of interference) is relatively stronger than the relay power, the fairness achieved by the relay strategies is overtaken by the proportional fairness utility

achieved by IC.

10 10

l)-(SISO-IC)

C-genera

(IR

sum rate [bits/sec/Hz]

Pmax [dB] 0 0

Pmax [dB]

(a) Improvement of IRC-general over SISO-IC

)-(SISO-IC)

10 10

(IRC-IN

sum rate [bits/sec/Hz]

Pmax [dB] 0 0

Pmax [dB]

(b) Improvement of IRC-IN over SISO-IC

Figure 5.5: The sum rate improvement of (a) IRC-general over SISO-IC and (b) IRC-IN over SISO-IC with a particular channel realization of a two-user SISO IRC with K = M = 2 .

5.1.4 Conclusion and future research directions

The achievable rate region of a SISO-IC has been an on-going research topic, with recent interest on the question whether a relay introduction to the SISO-IC, obtain-

SO-IC)

eral)-(SI

(IRC-gen

sum rate [bits/sec/Hz]

20 20

10 10

0 0

Pmax [dB] Pmax [dB]

(a) Improvement of IRC-general over SISO-IC

)-(SISO-I

(IRC-IN

sum rate [bits/sec/Hz]

20 20

10 10

0 0

Pmax [dB] Pmax [dB]

(b) Improvement of IRC-IN over SISO-IC

Figure 5.6: The proportional fairness improvement of (a) IRC-general over SISO-IC and (b) IRC-IN over SISO-IC of a particular channel realization of a two-user SISO IRC with K = M = 2.

ing an interference relay channel, provides any performance gain. In this paper, we study this problem by assuming an instantaneous amplify-and-forward relay with uninformed source and destination nodes in the system. We examine the gain of rate region of the relay introduction by formulating the Pareto boundary problem with optimization over relay processing matrix. The optimization problems, with and without the employment of interference neutralization techniques, are solved

SAPHYRE

Document Metadata

Document Metadata

Table of Contents