### 3.1 MBIP transformation

In this section, we first discuss a deterministic transformation of the probabilistic constraint in (P1). As assumed, the random variable $\mathbf{g}$ takes values from a finite set $\mathbf{G} = \{\mathbf{g}_1, \mathbf{g}_2, \dots, \mathbf{g}_N\}$ with a corresponding probability set $\{p_1, p_2, \dots, p_N\}$. We refer to each probable value $\mathbf{g}_n$ as one scenario. The probabilistic constraint can then be interpreted as requiring that the sum probability over all interference-violating scenarios be bounded by $p_{\text{th}}$. Therefore, we can reformulate the probabilistic constraint in (P1) as shown in the following problem:

\begin{align}
\text{(P2)}: \underset{\mathbf{K}_\mathbf{x},\, b_n}{\text{maximize}} \quad & \mathbf{h}^\dagger \mathbf{K}_\mathbf{x} \mathbf{h} && (7)\\
\text{subject to} \quad & \operatorname{tr}(\mathbf{K}_\mathbf{x}) \le P_{\text{tr}1} && (8)\\
& E\left[\mathbf{g}^\dagger \mathbf{K}_\mathbf{x} \mathbf{g}\right] \le P_{\text{tr}2} && (9)\\
& \mathbf{g}_n^\dagger \mathbf{K}_\mathbf{x} \mathbf{g}_n - M b_n \le r, \quad n = 1, \dots, N && (10)\\
& \sum_{n=1}^{N} b_n p_n \le p_{\text{th}}, \quad b_n \in \{0, 1\} && (11)\\
& \mathbf{K}_\mathbf{x} \succeq 0. && (12)
\end{align}

The two newly added constraints (10) and (11) are deterministic and involve only explicit functions, which can be easily handled by numerical algorithms. The design variables are now both the matrix $\mathbf{K}_\mathbf{x}$ and the binary variables $b_n$, $n = 1, 2, \dots, N$, where the binary variables indicate whether the interference outage check needs to be performed. If $b_n = 0$, no outage is possible under the scenario $\mathbf{g}_n$ given the constraint (10), such that $p_n$ need not be included in the left-hand sum of (11). If $b_n = 1$, there may or may not be an outage when the slack constant $M$ is chosen large enough; since $p_n$ is then always counted in the left-hand sum of (11), the constraint (11) enforces an upper bound of $p_{\text{th}}$ on the outage probability. The positive slack constant $M$ is chosen to be large because it is used to deactivate the outage check in (10) when $b_n = 1$. Given that $\sum_{n=1}^{N} b_n p_n$ is an upper bound on the outage probability, (P2) is actually a stricter version of (P1) with tighter constraints. As a result, the optimal objective value of (P2) will be slightly less than that of (P1). However, as we show later, the resulting performance is still much better than that of the reference approaches.

We now discuss how to determine the value of $M$, which needs to guarantee the satisfaction of the inequality (10) when $b_n = 1$. That is, $M$ is chosen large enough to deactivate the constraint (10) when $b_n = 1$, i.e., a scenario in which the SU-Tx may cause harm to the PU does not need to enforce the interference constraint (10) when we solve the problem (P2). For sufficiency, we could find an $M$ that is larger than the maximum value of $\mathbf{g}_n^\dagger \mathbf{K}_\mathbf{x} \mathbf{g}_n$ over $n = 1, \dots, N$. One way to achieve this is as follows:

\begin{align}
\mathbf{g}_n^\dagger \mathbf{K}_\mathbf{x} \mathbf{g}_n &= \operatorname{tr}\left(\mathbf{g}_n^\dagger \mathbf{K}_\mathbf{x} \mathbf{g}_n\right) && (13)\\
&= \operatorname{tr}\left(\mathbf{K}_\mathbf{x} \mathbf{g}_n \mathbf{g}_n^\dagger\right) && (14)\\
&\le \operatorname{tr}\left(\mathbf{K}_\mathbf{x}\right) \operatorname{tr}\left(\mathbf{g}_n \mathbf{g}_n^\dagger\right) && (15)\\
&\le P_{\text{tr}1} \operatorname{tr}\left(\mathbf{g}_n \mathbf{g}_n^\dagger\right). && (16)
\end{align}
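The inequality chain (13)-(16) can be verified numerically for a random positive semi-definite covariance; a minimal sketch, where the dimensions and power budget are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
Mt, P_tr1 = 3, 2.0  # hypothetical antenna count and transmit power budget

# A random Hermitian PSD covariance with tr(K_x) < P_tr1.
B = rng.normal(size=(Mt, Mt)) + 1j * rng.normal(size=(Mt, Mt))
K_x = B @ B.conj().T
K_x *= 0.8 * P_tr1 / np.trace(K_x).real
g_n = rng.normal(size=Mt) + 1j * rng.normal(size=Mt)

interference = np.real(g_n.conj() @ K_x @ g_n)               # (13)-(14)
bound_15 = np.trace(K_x).real * np.real(np.vdot(g_n, g_n))   # tr(K_x) tr(g_n g_n')
bound_16 = P_tr1 * np.real(np.vdot(g_n, g_n))                # using tr(K_x) <= P_tr1

assert interference <= bound_15 <= bound_16
```

The step (14)→(15) uses the fact that $\operatorname{tr}(\mathbf{A}\mathbf{B}) \le \operatorname{tr}(\mathbf{A})\operatorname{tr}(\mathbf{B})$ for positive semi-definite $\mathbf{A}$, $\mathbf{B}$.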

Given the above inequality, one could choose a different value of $M$ for each scenario $n$. Instead, we conveniently choose a single value, $M = \max_n P_{\text{tr}1} \operatorname{tr}\left(\mathbf{g}_n \mathbf{g}_n^\dagger\right)$, for all scenarios, since the only purpose of $M$ is to deactivate the constraints. With the value of $M$ available, we next solve the MBIP problem (P2). A direct approach is exhaustive search over the binary variables $b_n$, where for each feasible choice of the $b_n$'s we solve the following convex semi-definite programming (SDP) problem:

\begin{align}
\underset{\mathbf{K}_\mathbf{x}}{\text{maximize}} \quad & \operatorname{tr}\left(\mathbf{K}_\mathbf{x} \mathbf{h}\mathbf{h}^\dagger\right) && (17)\\
\text{subject to} \quad & \operatorname{tr}\left(\mathbf{K}_\mathbf{x}\right) \le P_{\text{tr}1} && (18)\\
& E\left[\operatorname{tr}\left(\mathbf{K}_\mathbf{x} \mathbf{g}\mathbf{g}^\dagger\right)\right] \le P_{\text{tr}2} && (19)\\
& \operatorname{tr}\left(\mathbf{K}_\mathbf{x} \mathbf{g}_n \mathbf{g}_n^\dagger\right) \le M b_n + r, \quad n = 1, \dots, N && (20)\\
& \mathbf{K}_\mathbf{x} \succeq 0. && (21)
\end{align}

Unfortunately, such an exhaustive search in general incurs exponential complexity. Instead, we discuss a BB approach that searches over the binary variables more efficiently on average.

### 3.2 Branch and bound algorithm

As mentioned before, one way to solve an MBIP problem is through exhaustive search, but the feasible space grows exponentially with the number of binary variables, which is the source of the NP-hardness of most binary optimization problems. Fortunately, BB algorithms [26–28] can often be used to solve discrete and combinatorial optimization problems with reduced average complexity, when the problem has a finite but very large solution set with certain structures.

We first give a brief overview of the BB algorithm, followed by its specific implementation for solving the MBIP problem (P2). Two components are usually required for an effective BB implementation: a *branching procedure* and a *bounding function*. Given a set $S$, the branching procedure returns non-overlapping subsets $S_1, S_2, \dots$ whose union is $S$. The bounding function then computes upper and/or lower bounds on the optimal value over each subset $S_i$. These bounds determine one of two outcomes: split the subset $S_i$ into further subsets for more bounding, or discard $S_i$ from the search space entirely. The latter is referred to as pruning and is the reason why the BB algorithm is more efficient than exhaustive search.

It is clear that problem (P2) becomes an SDP over the design variable $\mathbf{K}_\mathbf{x}$ when the binary variables are relaxed to lie within [0, 1]. With this property, we next implement the BB algorithm to jointly search over $\mathbf{K}_\mathbf{x}$ and the binary variables $b_n$. Due to its recursive nature, the BB algorithm traverses a binary search tree (BST), as shown in Figure 2. Each *node* in the BST represents the case where the relaxed SDP from (P2) is solved at a partial or complete binary solution. In particular, the *root node* corresponds to the case where all $b_n$'s are relaxed to lie within [0, 1]; a *leaf node* is a node at the bottom of the BST and denotes a complete binary solution, whose resulting objective value of (P2) is called the *incumbent* if it is the best objective value found so far across all known leaf nodes. The *depth* of a node, $D$, is defined as the number of determined binary variables in the partial binary solution at that node. As $D$ increases from $D = j$ to $D = j + 1$, one additional binary variable is determined. Specifically, at a particular node assume that $b_1, b_2, \dots, b_{n-1}$ have been determined. We then create two child nodes, corresponding to two sub-problems in the relaxed SDP form of (P2) with $b_n = 0$ and $b_n = 1$, respectively, while keeping $b_1, b_2, \dots, b_{n-1}$ unchanged and rounding all undetermined binary variables, $b_{n+1}, b_{n+2}, \dots, b_N$, to ones. For a given sub-problem, if the achieved optimal objective value is lower than the current incumbent, the corresponding child node (along with all of its descendants) is discarded, i.e., pruned from the search space. Otherwise, the child node is kept in the BST, and the search continues to $b_{n+1}$ until we reach a leaf node with a complete binary solution.
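The traversal just described can be sketched in a few lines of plain Python. In the sketch below, `solve_relaxed` is a stand-in for solving the relaxed SDP of (P2): it uses a toy separable objective so the pruning logic is self-contained, and the data (`p`, `p_th`, `gain`) are illustrative assumptions rather than values from this article. The node bound follows the text: undetermined binary variables are rounded to one.

```python
# Depth-first branch and bound over the binary variables b_1, ..., b_N.
N = 4
p = [0.1, 0.2, 0.3, 0.4]     # hypothetical scenario probabilities
p_th = 0.5                   # outage budget in constraint (11)
gain = [3.0, 1.0, 4.0, 2.0]  # toy objective gain obtained when b_n = 1

def feasible(b):
    """Sum outage constraint (11) for a complete binary vector."""
    return sum(bn * pn for bn, pn in zip(b, p)) <= p_th + 1e-12

def solve_relaxed(fixed):
    """Bound at a node: undetermined b_n rounded to one (looser constraints)."""
    b = fixed + [1] * (N - len(fixed))
    return sum(bn * gn for bn, gn in zip(b, gain))

def branch_and_bound():
    incumbent, best_b, nodes = float("-inf"), None, 0
    stack = [[]]                       # each entry is a partial binary solution
    while stack:
        fixed = stack.pop()
        nodes += 1
        if not feasible(fixed + [0] * (N - len(fixed))):
            continue                   # no completion can satisfy (11): prune
        if solve_relaxed(fixed) <= incumbent:
            continue                   # bound below incumbent: prune subtree
        if len(fixed) == N:            # leaf node: update the incumbent
            incumbent, best_b = solve_relaxed(fixed), fixed
            continue
        stack += [fixed + [0], fixed + [1]]
    return incumbent, best_b, nodes

best_value, best_b, nodes_visited = branch_and_bound()
```

With these toy numbers the search visits 13 of the 31 possible tree nodes; replacing `solve_relaxed` with an actual relaxed SDP solve recovers the algorithm of this section.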

Following the above procedure, the BB algorithm traverses the BST by solving one relaxed SDP for an optimal $\mathbf{K}_\mathbf{x}$ at each node. The algorithm terminates when the entire BST has been either pruned or processed. All computations in our algorithm are performed with the MATLAB-based software package CVX [29, 30], which deploys SeDuMi [31] as its back-end solver for SDP problems.

### 3.3 Complexity analysis

In this section, we discuss the complexity of the proposed algorithm. The efficiency of the algorithm depends critically on the branching and bounding procedures: the entire search space is branched into non-overlapping subsets, and the bounding procedure then calculates bounds for each subset, with decisions made on whether to continue branching or to discard the entire subset. The pruning process, which allows the algorithm to traverse only a fraction of the entire BST, is the key to decreasing the overall search complexity. In our implementation, at the root node there are no determined binary variables, i.e., all binary variables are relaxed. At each child node, one additional binary variable is determined. During each iteration, one node is chosen and its bound is calculated by solving the relaxed SDP. If the bound is lower than the incumbent, then no child node branched from this node can yield a solution better than the incumbent, and the node is therefore pruned. If a node at depth $j$ is pruned, we can calculate how many potential child nodes of this branch are pruned, which indicates how much search complexity is saved. For the simulations, we set $M_t = 2$. We assume that each element of $\mathbf{g}_n$ is generated by quantizing a random variable distributed as $\mathcal{CN}(0, 0.1)$ into four levels, and the corresponding $p_n$ is determined by integrating the probability density function over the associated quantization levels. The secondary transmit power ranges from 0 dB to 10 dB. Accordingly, the MBIP problem has 16 binary design variables, such that if exhaustive search were deployed, a total of $2^{16} = 65536$ sub-problems would need to be solved. With our approach, Figure 3 shows the update progress of the incumbent, and Figure 4 shows the percentage of pruned nodes at each iteration; we only need to solve 273 sub-problems in the simulation.
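How the $p_n$'s arise from the quantization can be illustrated for a single real component. The sketch below quantizes a zero-mean Gaussian scalar into four cells and obtains each cell probability by integrating the pdf via the Gaussian CDF; the threshold placement (at $0$ and $\pm\sigma$) and the use of variance 0.1 for the illustrated scalar are our assumptions, since the exact quantization levels are not restated here.

```python
import math

def gauss_cdf(x, sigma):
    """CDF of a zero-mean real Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

sigma = math.sqrt(0.1)                             # illustrative variance 0.1
edges = [-math.inf, -sigma, 0.0, sigma, math.inf]  # hypothetical 4-level cells

# p_n for each quantization cell = integral of the pdf over that cell.
probs = [gauss_cdf(edges[i + 1], sigma) - gauss_cdf(edges[i], sigma)
         for i in range(4)]
```

By construction the four probabilities sum to one, and the two inner cells carry more probability mass than the two tails.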

*Remark*: The number of sub-problems solved by our BB algorithm varies over different channel realizations. Typically, we observe that fewer than 700 sub-problems in total are solved with our BB algorithm across a large number of channel realizations.

### 3.4 Rank-one property of the optimal $\mathbf{K}_\mathbf{x}$

Note that Zhang and Liang [8] studied a similar problem without the PU outage constraint and proved that the optimal $\mathbf{K}_\mathbf{x}$ must be a rank-one matrix. To prove the rank-one property of the optimal $\mathbf{K}_\mathbf{x}$ in our case, we focus on the following problem, equivalent to the relaxed SDP problem at each given set of $b_n$'s:

\begin{align}
\text{(P3)}: \underset{\mathbf{K}_\mathbf{x}}{\text{maximize}} \quad & \log\left(1 + \mathbf{h}^\dagger \mathbf{K}_\mathbf{x} \mathbf{h}\right) && (22)\\
\text{subject to} \quad & \operatorname{tr}\left(\mathbf{K}_\mathbf{x}\right) \le P_{\text{tr}1} && (23)\\
& \operatorname{tr}\left(\mathbf{K}_\mathbf{x} E\left[\mathbf{g}\mathbf{g}^\dagger\right]\right) \le P_{\text{tr}2} && (24)\\
& \operatorname{tr}\left(\mathbf{K}_\mathbf{x} \mathbf{g}_n \mathbf{g}_n^\dagger\right) \le r, \quad \forall n \in T_1 && (25)\\
& \operatorname{tr}\left(\mathbf{K}_\mathbf{x} \mathbf{g}_m \mathbf{g}_m^\dagger\right) \le r + M, \quad \forall m \in T_2 && (26)\\
& \mathbf{K}_\mathbf{x} \succeq 0, && (27)
\end{align}

where the replacement of $\mathbf{h}^\dagger \mathbf{K}_\mathbf{x} \mathbf{h}$ by $\log\left(1 + \mathbf{h}^\dagger \mathbf{K}_\mathbf{x} \mathbf{h}\right)$ in the objective function is for the convenience of applying the Karush-Kuhn-Tucker (KKT) optimality conditions [30], the set $T_1$ contains all the indices with $b_n = 0$, and the set $T_2$ contains all the indices with $b_n = 1$.

The Lagrange dual function of (P3) can thus be written as

\begin{align}
g(\nu, \theta, \mu_n, \lambda_m) = \sup_{\mathbf{K}_\mathbf{x} \succeq 0} \bigg\{ & \log\left(1 + \mathbf{h}^\dagger \mathbf{K}_\mathbf{x} \mathbf{h}\right) \nonumber\\
& - \operatorname{tr}\bigg[\mathbf{K}_\mathbf{x}\bigg(\nu \mathbf{I} + \theta E\left[\mathbf{g}\mathbf{g}^\dagger\right] + \sum_{n \in T_1} \mu_n \mathbf{g}_n \mathbf{g}_n^\dagger + \sum_{m \in T_2} \lambda_m \mathbf{g}_m \mathbf{g}_m^\dagger\bigg)\bigg] \bigg\} \nonumber\\
& + \nu P_{\text{tr}1} + \theta P_{\text{tr}2} + r \sum_{n \in T_1} \mu_n + (r + M) \sum_{m \in T_2} \lambda_m, && (28)
\end{align}

where $\nu$, $\theta$, $\mu_n$, and $\lambda_m$ are the dual variables associated with the constraints (23)–(26), respectively. We then define the matrix $\mathbf{A}$ as

\begin{align}
\mathbf{A} = \nu \mathbf{I} + \theta E\left[\mathbf{g}\mathbf{g}^\dagger\right] + \sum_{n \in T_1} \mu_n \mathbf{g}_n \mathbf{g}_n^\dagger + \sum_{m \in T_2} \lambda_m \mathbf{g}_m \mathbf{g}_m^\dagger, && (29)
\end{align}

and we show that $\mathbf{A}$ must have full rank $M_t$ in order for the dual function to have a bounded value. First, it is clear that in the case of either $\nu > 0$ or $\theta > 0$, $\mathbf{A}$ must have full rank. When both $\nu = 0$ and $\theta = 0$, we prove that $\mathbf{A}$ also needs to have full rank by a contradiction argument discussed in [32]. Assume $\mathbf{A}$ is rank deficient; then we may take $\mathbf{K}_\mathbf{x} = t \mathbf{v}_j \mathbf{v}_j^\dagger$, where $\mathbf{v}_j$ is an eigenvector of $\mathbf{A}$ corresponding to a zero eigenvalue and $t > 0$ is a scaling coefficient. The trace term in (28) is then zero. Since $\mathbf{h}$ is drawn from a continuous distribution, the probability that $\mathbf{h}$ is orthogonal to $\mathbf{v}_j$ is zero. It thus follows that the supremum in (28) becomes unbounded as the magnitude of $t$ is scaled up to infinity. We therefore conclude that $\mathbf{A}$ must have full rank, which allows us to define a new design variable $\bar{\mathbf{K}}_\mathbf{x} = \mathbf{A}^{\frac{1}{2}} \mathbf{K}_\mathbf{x} \mathbf{A}^{\frac{1}{2}}$ and rewrite the Lagrange dual function, up to terms independent of $\mathbf{K}_\mathbf{x}$, as the optimal value of the following problem:

\begin{align}
\underset{\bar{\mathbf{K}}_\mathbf{x}}{\text{maximize}} \quad & \log\left(1 + \mathbf{h}^\dagger \mathbf{A}^{-\frac{1}{2}} \bar{\mathbf{K}}_\mathbf{x} \mathbf{A}^{-\frac{1}{2}} \mathbf{h}\right) - \operatorname{tr}\left(\bar{\mathbf{K}}_\mathbf{x}\right) && (30)\\
\text{subject to} \quad & \bar{\mathbf{K}}_\mathbf{x} \succeq 0. && (31)
\end{align}

This problem is convex and has strictly feasible points; thus the optimal {\stackrel{\u0304}{\mathbf{K}}}_{\mathbf{x}} must satisfy the KKT conditions [30] as follows,

\begin{align}
& \frac{1}{\ln 2}\, \mathbf{A}^{-\frac{1}{2}} \mathbf{h} \left(1 + \mathbf{h}^\dagger \mathbf{A}^{-\frac{1}{2}} \bar{\mathbf{K}}_\mathbf{x} \mathbf{A}^{-\frac{1}{2}} \mathbf{h}\right)^{-1} \mathbf{h}^\dagger \mathbf{A}^{-\frac{1}{2}} + \mathbf{\Phi} = \mathbf{I} && (32)\\
& \operatorname{tr}\left(\mathbf{\Phi} \bar{\mathbf{K}}_\mathbf{x}\right) = 0, && (33)
\end{align}

where $\mathbf{\Phi} \succeq 0$ is the dual variable associated with the constraint (31). Since the right-hand side of (32) is a matrix of full rank $M_t$ and the first term on the left-hand side has unit rank, the matrix $\mathbf{\Phi}$ must have rank at least $M_t - 1$. Given $\bar{\mathbf{K}}_\mathbf{x} \succeq 0$ and $\mathbf{\Phi} \succeq 0$, together with (33), we conclude that the rank of the nontrivial optimal $\bar{\mathbf{K}}_\mathbf{x}$, and hence of the optimal $\mathbf{K}_\mathbf{x}$, is one. Since the above result holds for all feasible dual variables, when $\nu$, $\theta$, $\mu_n$, and $\lambda_m$ take their optimal values in the dual problem, the optimal $\mathbf{K}_\mathbf{x}$ recovered from the optimal $\bar{\mathbf{K}}_\mathbf{x}$ in (30)-(31) is also the optimal solution of the original problem (22)-(27), and it is of rank one. As such, beamforming is optimal for the CR transmitter even under the PU outage probability constraint, and the optimal beamformer can be obtained directly as the principal eigenvector of the rank-one optimal $\mathbf{K}_\mathbf{x}$.
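In practice, the beamformer is extracted from the numerical $\mathbf{K}_\mathbf{x}$ by an eigendecomposition. A minimal sketch, where a synthetic rank-one covariance stands in for the matrix an SDP solver would return:

```python
import numpy as np

rng = np.random.default_rng(2)
Mt, P = 4, 2.0  # hypothetical antenna count and beam power

# Synthetic rank-one optimal covariance K_x = P * w w', with unit-norm w,
# standing in for the solver output.
w = rng.normal(size=Mt) + 1j * rng.normal(size=Mt)
w /= np.linalg.norm(w)
K_x = P * np.outer(w, w.conj())

# The optimal beamformer is the principal eigenvector of K_x.
eigvals, eigvecs = np.linalg.eigh(K_x)   # eigenvalues in ascending order
beamformer = eigvecs[:, -1]
beam_power = eigvals[-1]
```

All but the largest eigenvalue are numerically zero, confirming the rank-one structure, and `beamformer` equals `w` up to a complex phase.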

### 3.5 An efficient heuristic algorithm

As an alternative to the high-complexity BB-based solution, we propose a heuristic but efficient algorithm for finding a suboptimal solution of the MBIP problem (P2). By inspecting the objective function and the constraint (10) in (P2), we see that limiting the interference to the PU via (10) severely limits the SU received signal power when a $\mathbf{g}_n$ is highly correlated with $\mathbf{h}$. To prevent this, we could manually declare such a case an outage scenario with $b_n = 1$, as long as the outage probability constraint is still satisfied. By doing so, the corresponding constraint $\mathbf{g}_n^\dagger \mathbf{K}_\mathbf{x} \mathbf{g}_n \le r + M$ becomes inactive for a large $M$, such that no power restriction is enforced along the direction correlated with $\mathbf{h}$. With this approach applied to the subset of $\mathbf{g}_n$'s correlated with $\mathbf{h}$, we can achieve a good balance between maximizing the SU rate and protecting the PU; the philosophy is that since a certain PU outage is allowed, we should greedily spend this outage allowance on the $\mathbf{g}_n$'s that are aligned in a direction similar to $\mathbf{h}$. Note that the rank-one result for $\mathbf{K}_\mathbf{x}$ given in the last subsection also applies to this heuristic algorithm, so beamforming remains the optimal transmission strategy. Specifically, we use the angles between the $\mathbf{g}_n$'s and $\mathbf{h}$, defined by $\cos(\theta_n) = \frac{\bar{\mathbf{g}}_n^\dagger \bar{\mathbf{h}}}{\|\bar{\mathbf{g}}_n\| \, \|\bar{\mathbf{h}}\|}$, with $\bar{\mathbf{g}}_n = \left[\operatorname{Re}(\mathbf{g}_n)\;\operatorname{Im}(\mathbf{g}_n)\right]^\dagger$ and $\bar{\mathbf{h}} = \left[\operatorname{Re}(\mathbf{h})\;\operatorname{Im}(\mathbf{h})\right]^\dagger$, as a measure of directional correlation; a smaller angle means that the direction of $\mathbf{g}_n$ is closer to that of $\mathbf{h}$.
The proposed algorithm first sorts the set $\{\mathbf{g}_1, \mathbf{g}_2, \dots, \mathbf{g}_N\}$ in descending order of $|\cos(\theta_n)|$ to form a new set $\{\tilde{\mathbf{g}}_1, \tilde{\mathbf{g}}_2, \dots, \tilde{\mathbf{g}}_N\}$ with a corresponding probability set $\{\tilde{p}_1, \tilde{p}_2, \dots, \tilde{p}_N\}$. The $\tilde{b}_n$ values are all initialized to zero. Starting with $\tilde{\mathbf{g}}_1$, which has the smallest angle with $\mathbf{h}$ (i.e., the highest correlation to $\mathbf{h}$ among all the $\tilde{\mathbf{g}}_n$'s), we add this scenario to the outage budget by setting the corresponding $\tilde{b}_1$ to one, as long as doing so does not violate the sum probability constraint $\sum_{n=1}^{N} \tilde{b}_n \tilde{p}_n \le p_{\text{th}}$; otherwise $\tilde{b}_1$ is set to zero. This process continues sequentially through the set $\{\tilde{\mathbf{g}}_1, \tilde{\mathbf{g}}_2, \dots, \tilde{\mathbf{g}}_N\}$, which results in a pre-determined set of $\tilde{b}_n$'s that satisfies the sum outage constraint. The convex SDP problem in the form of (P3) with these pre-determined binary variables is then solved to obtain the optimal covariance matrix $\mathbf{K}_\mathbf{x}$. Although the optimality (with respect to the solution of (P2)) of the SU rate obtained this way is not guaranteed, the heuristic algorithm offers an efficient solution to an otherwise complex problem by solving only one SDP.
Numerical results in the following section show the encouraging performance of this heuristic algorithm in comparison with the BB and reference algorithms.