The MSK symbols are sinusoids over one symbol interval $T$ with different frequencies $\omega_i$: $$g_i(t)=\sin(\omega_i t),\quad 0\le t < T\tag{1}$$ The pulses $g_i(t)$ are orthogonal if $$\int_{-\infty}^{\infty}g_i(t)g_j(t)\,dt = 0,\quad i\neq j\tag{2}$$ is satisfied. Using the pulses from (1) gives $$\int_{0}^{T}\sin(\omega_i t)\sin(\omega_j t)\,dt = 0,\quad i\neq j\tag{3}$$ Expanding the product in (3) with $\sin\alpha\sin\beta=\frac12[\cos(\alpha-\beta)-\cos(\alpha+\beta)]$ results in the condition $$\frac12\int_{0}^{T}\cos[(\omega_i-\omega_j)t]\,dt-\frac12\int_{0}^{T}\cos[(\omega_i+\omega_j)t]\,dt = 0,\quad i\neq j$$ Both integrals vanish, and the condition is therefore satisfied, if $$\begin{aligned}\sin[(\omega_i-\omega_j)T]&=0,\quad i\neq j\\ \sin[(\omega_i+\omega_j)T]&=0,\quad i\neq j\end{aligned}$$ which finally results in the conditions $$\begin{aligned}\Delta\omega&=\omega_i-\omega_j=\frac{m\pi}{T},\quad m=1,2,\ldots\\ \omega_c&=\frac{\omega_i+\omega_j}{2}=\frac{n\pi}{2T},\quad n=1,2,\ldots\end{aligned}\tag{4}$$ where $\Delta\omega$ is the frequency separation, and $\omega_c$ is the nominal carrier frequency. From (4), the minimum frequency separation guaranteeing orthogonality is obtained for $m=1$: $$\Delta\omega=\frac{\pi}{T}\tag{5}$$ This is half the frequency separation $2\pi/T$ of traditional FSK ("Sunde's FSK"), which is why the technique is called minimum shift keying. Due to the smaller frequency separation, MSK has a higher bandwidth efficiency than Sunde's FSK.
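Conditions (4) and (5) are easy to sanity-check numerically. The following is only a minimal sketch: the helper names `inner_product` and `tone_pair`, the carrier index `n=8` and the midpoint-rule integration are illustrative assumptions, not part of the derivation above.

```python
import math

T = 1.0  # symbol interval (arbitrary units)

def inner_product(w_i, w_j, T=T, steps=20_000):
    """Midpoint-rule approximation of the integral of sin(w_i t) sin(w_j t) over [0, T]."""
    dt = T / steps
    return sum(
        math.sin(w_i * (k + 0.5) * dt) * math.sin(w_j * (k + 0.5) * dt)
        for k in range(steps)
    ) * dt

def tone_pair(n, m, T=T):
    """Return (w_i, w_j) with centre n*pi/(2T) and separation m*pi/T, as in (4)."""
    w_c = n * math.pi / (2 * T)
    d_w = m * math.pi / T
    return w_c + d_w / 2, w_c - d_w / 2

# m = 1 is the minimum separation (5); m = 0.5 violates (4).
orthogonal = inner_product(*tone_pair(n=8, m=1))
not_orthogonal = inner_product(*tone_pair(n=8, m=0.5))
print(orthogonal, not_orthogonal)
```

With the minimum separation the inner product vanishes up to integration error, whereas halving the separation again leaves a clearly non-zero correlation between the two tones.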
In the following I will restrict the discussion to binary MSK. With $\omega_c$ and $\Delta\omega$ given by (4) and (5) we can write the MSK signal as \begin{align} s(t)&=\cos\left[\omega_ct+b(t)\cdot\frac{\Delta\omega}{2}\cdot t+\phi(t)\right]\\ &= \cos\left[\omega_ct+b(t)\cdot\frac{\pi t}{2T}+\phi(t)\right]\tag{6} \end{align} where $b(t)$ assumes the values $+1$ and $-1$, depending on the bit sequence: $$b(t)=\sum_kA_kp(t-kT),\quad A_k\in\{+1,-1\}$$ with the rectangular pulse $p(t)$ being $1$ in the interval $[0,T)$ and zero everywhere else. The piecewise constant function $\phi(t)$ in (6) is necessary to guarantee that the MSK signal has a continuous phase. At $t=kT$ the information bit (and with it the value of $b(t)$) changes from $A_{k-1}$ to $A_k$, and the value of the phase function $\phi(t)$ changes from $\phi_{k-1}$ to $\phi_k$. In order for the total phase of the MSK signal (6) to be continuous, the following equation must be satisfied: $$\omega_ckT+A_{k-1}\frac{\pi kT}{2T}+\phi_{k-1}=\omega_ckT+A_{k}\frac{\pi kT}{2T}+\phi_{k}$$ which results in $$\phi_k=\phi_{k-1}+(A_{k-1}-A_k)\frac{k\pi}{2}\quad(\textrm{mod }2\pi)\tag{7}$$ Since $|A_{k-1}-A_k|$ is either $0$ or $2$, the increment in (7) is either $0$ or $\pm k\pi$, and $\pm k\pi$ is a multiple of $2\pi$ whenever $k$ is even. Hence $\phi(t)$ can only change (modulo $2\pi$) when $k$ is odd and $A_{k-1}\neq A_k$. So we can rewrite (7) as $$\phi_k=\begin{cases}\phi_{k-1},& k\textrm{ even OR }A_k=A_{k-1}\\ \phi_{k-1}\pm\pi,& k\textrm{ odd AND }A_k\neq A_{k-1}\end{cases}\tag{8}$$ One important conclusion from (8) is that the discontinuous function $\phi(t)$ defined by the values $\phi_k$ changes between the values $0$ and $\pm\pi$ (if we assume $\phi_0=0$), and, even more importantly, it can only change when $k$ is odd. This means that it changes at half the symbol rate (or, because we consider binary MSK, half the bit rate), i.e. its rate is $1/(2T)$.
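The recursion (7) and the resulting phase continuity can be checked with a few lines of code. This is a sketch under stated assumptions: the bit sequence `A`, the carrier index $n=8$ and the function names are hypothetical choices made only for illustration.

```python
import math

def phase_offsets(bits, phi0=0.0):
    """Apply recursion (7): phi_k = phi_{k-1} + (A_{k-1} - A_k) * k*pi/2 (mod 2*pi)."""
    phis = [phi0]
    for k in range(1, len(bits)):
        step = (bits[k - 1] - bits[k]) * k * math.pi / 2
        phis.append((phis[-1] + step) % (2 * math.pi))
    return phis

def total_phase(t, k, bits, phis, omega_c, T):
    """Argument of the cosine in (6) during bit interval k."""
    return omega_c * t + bits[k] * math.pi * t / (2 * T) + phis[k]

T = 1.0
omega_c = 8 * math.pi / (2 * T)          # hypothetical choice n = 8 in (4)
A = [+1, -1, -1, +1, +1, -1, +1, -1]     # hypothetical bit sequence A_0 ... A_7
phi = phase_offsets(A)

# Phase jump at each boundary t = kT, wrapped to (-pi, pi]; all should be ~0.
jumps = []
for k in range(1, len(A)):
    d = (total_phase(k * T, k, A, phi, omega_c, T)
         - total_phase(k * T, k - 1, A, phi, omega_c, T))
    jumps.append(math.atan2(math.sin(d), math.cos(d)))

print([round(math.cos(p)) for p in phi])  # sign of cos(phi_k) per bit interval
print(max(abs(j) for j in jumps))
```

Because $\phi_k$ is only defined modulo $2\pi$, the sketch compares $\cos\phi_k$ rather than the raw values; the sign flips only at odd $k$ where the bit changes, in line with (8).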
The following figure shows a possible information signal $b(t)$ and the corresponding in-phase and quadrature components $a_I(t)$ and $a_Q(t)$, respectively. It can be seen that both $a_I(t)$ and $a_Q(t)$ have a rate of $1/(2T)$ and that the two components never change at the same time, i.e. they are offset by one bit period.
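The alternating switching instants can also be reproduced in code. The sketch below assumes $a_I(t)=\cos\phi(t)$ and $a_Q(t)=b(t)\cos\phi(t)$, which is what (6) yields for $\phi(t)\in\{0,\pm\pi\}$ when the signal is split into a $\cos(\omega_ct)$ and a $\sin(\omega_ct)$ branch. These definitions, the bit sequence and the function name `iq_amplitudes` are assumptions introduced here for illustration.

```python
def iq_amplitudes(bits):
    """Per-bit values a_I[k] = cos(phi_k) and a_Q[k] = A_k * cos(phi_k), with phi_0 = 0."""
    c = 1                      # cos(phi_k), starts at cos(0) = +1
    a_i, a_q = [], []
    for k, a in enumerate(bits):
        # By (8), phi jumps by pi (so its cosine flips) only at odd k with a bit change.
        if k % 2 == 1 and a != bits[k - 1]:
            c = -c
        a_i.append(c)
        a_q.append(a * c)
    return a_i, a_q

A = [+1, +1, -1, -1, +1, -1, -1, +1]     # hypothetical bit sequence A_0 ... A_7
a_I, a_Q = iq_amplitudes(A)

# Bit indices k at which each component changes its value at t = kT.
i_changes = [k for k in range(1, len(A)) if a_I[k] != a_I[k - 1]]
q_changes = [k for k in range(1, len(A)) if a_Q[k] != a_Q[k - 1]]
print(i_changes, q_changes)
```

Under these assumptions $a_I$ can only switch at odd multiples of $T$ and $a_Q$ only at even multiples, so each component is constant for at least $2T$ and the two are staggered by one bit period.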
The next figure shows the sinusoidally pulse-shaped signals $a_I(t)\cos\left(\frac{\pi t}{2T}\right)$ and $a_Q(t)\sin\left(\frac{\pi t}{2T}\right)$, and the final MSK signal $s(t)$:
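As a numerical cross-check of these curves, the sketch below compares $s(t)$ evaluated directly from (6) with the sum of the two pulse-shaped components modulated onto quadrature carriers. It assumes $a_I(t)=\cos\phi(t)$, $a_Q(t)=b(t)\cos\phi(t)$ and the sign convention $s(t)=a_I(t)\cos\left(\frac{\pi t}{2T}\right)\cos(\omega_ct)-a_Q(t)\sin\left(\frac{\pi t}{2T}\right)\sin(\omega_ct)$; the bit sequence and the carrier index $n=8$ are hypothetical.

```python
import math

T = 1.0
omega_c = 8 * math.pi / (2 * T)              # hypothetical choice n = 8 in (4)
A = [+1, +1, -1, -1, +1, -1, -1, +1]         # hypothetical bit sequence

# Phase offsets from recursion (7), starting from phi_0 = 0.
phi = [0.0]
for k in range(1, len(A)):
    phi.append(phi[-1] + (A[k - 1] - A[k]) * k * math.pi / 2)

def s_direct(t):
    """MSK signal evaluated directly from (6)."""
    k = min(int(t // T), len(A) - 1)
    return math.cos(omega_c * t + A[k] * math.pi * t / (2 * T) + phi[k])

def s_quadrature(t):
    """Same signal built from the pulse-shaped I and Q parts (assumed sign convention)."""
    k = min(int(t // T), len(A) - 1)
    a_i = math.cos(phi[k])
    a_q = A[k] * math.cos(phi[k])
    x = math.pi * t / (2 * T)
    return (a_i * math.cos(x) * math.cos(omega_c * t)
            - a_q * math.sin(x) * math.sin(omega_c * t))

samples = [i * len(A) * T / 4000 for i in range(4000)]
max_error = max(abs(s_direct(t) - s_quadrature(t)) for t in samples)
print(max_error)
```

The two expressions agree to within floating-point rounding because $\sin\phi_k=0$ for every $k$, so the phase term in (6) only contributes the sign $\cos\phi_k=\pm1$.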