Solving Equations of Degree at Most 4


The main problem is the solution of polynomial equations in one variable $$ \begin{equation}\label{1.0.0} X^n + a_{1} X^{n-1} + \cdots + a_n = 0 \end{equation} $$ with particular interest in solution "by radicals", that is to say a solution that uses only the arithmetic operations and roots $\sqrt[m]{a}$. It has been known since the 16th century that equations of degree $n \leq 4$ can be solved by radicals. On the other hand, according to a famous result of Abel, the general equation of degree $n \geq 5$ is not solvable by radicals. The main idea of Galois theory is to associate with each equation its symmetry group. This construction makes it possible to translate properties of the equation (such as solvability by radicals) into properties of the associated group.

Consider the equation \eqref{1.0.0} of degree $n\geq 1$ with complex coefficients $a_1,\ldots, a_n \in\mathbb{C}$. According to a fundamental result of Gauss, the equation \eqref{1.0.0} admits exactly $n$ complex roots $x_1,\ldots, x_n$ (which are not, in general, distinct) in the sense that one has the polynomial identity $$ \begin{equation}\label{1.1.2} X^n+a_1X^{n-1}+\ldots+a_n=(X-x_1)\cdots (X-x_n) \end{equation} $$ By expanding the right-hand side and comparing coefficients, we obtain the following relations: $$\begin{equation}\label{1.1.3} a_1=-\sigma_1,a_2=\sigma_2,\ldots,a_n=(-1)^n\sigma_n\end{equation} $$ where $$ \begin{align} \sigma_1&=x_1+x_2+\ldots+x_n=\sum_ix_i\label{1.1.4}\\ \sigma_2&= x_1x_2+x_1x_3+\ldots+x_{n-1}x_n=\sum_{i<j}x_ix_j\nonumber\\ \vdots~~~ & \qquad\vdots \nonumber\\ \sigma_k& = \sum_{1\leq i_1<\ldots< i_k\leq n}x_{i_1}\cdots x_{i_k}\\ \vdots~~~ & \qquad\vdots \nonumber\\ \sigma_n&=x_1\cdots x_n\nonumber \end{align} $$ are the elementary symmetric polynomials in the roots $x_1,\ldots, x_n$. In other words, solving \eqref{1.0.0} is equivalent to solving the system \eqref{1.1.4} with $\sigma_k=(-1)^ka_k$. The functions in \eqref{1.1.4} are symmetric in $x_1,\ldots,x_n$. Following Lagrange, the system of equations \eqref{1.1.4} can be solved by gradually breaking its symmetry. As explained below, this method succeeds if $n \leq 4$.
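
The relations $a_k=(-1)^k\sigma_k$ can be checked numerically. The following is a minimal sketch in Python (standard library only); the function names `elementary_symmetric` and `coefficients_from_roots` are illustrative choices and do not come from the source.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(roots, k):
    """sigma_k: sum of the products over all k-element subsets of the roots."""
    return sum(prod(subset) for subset in combinations(roots, k))


def coefficients_from_roots(roots):
    """Return [a_1, ..., a_n] of the monic polynomial, using a_k = (-1)^k sigma_k."""
    n = len(roots)
    return [(-1) ** k * elementary_symmetric(roots, k) for k in range(1, n + 1)]


# (X - 1)(X - 2)(X - 3) = X^3 - 6X^2 + 11X - 6
print(coefficients_from_roots([1, 2, 3]))  # [-6, 11, -6]
```

Using `combinations` enforces the strict ordering $i_1<\ldots<i_k$ of the indices, so each product is counted once.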



For the quadratic equation $X^2+a_1X+a_2=0$, the system to be solved is $$ x_1 + x_2 = -a_1, \qquad x_1 x_2 = a_2. $$ We consider the function $$ \begin{equation}\label{1.2.2} y = x_1 - x_2 \end{equation} $$ which is not symmetric in $x_1$ and $x_2$, but its square is: $$ \begin{align*} y^2 &= (x_1 - x_2)^2 = x^2_1 + x^2_2 - 2x_1 x_2 \\ &= (x_1 + x_2)^2 - 4x_1 x_2 = \sigma_1^2 - 4\sigma_2 = a^2_1 - 4a_2, \end{align*} $$ hence the well-known formulas: $$ y = \pm\sqrt{a^2_1-4a_2}, \qquad x_1, x_2 =\dfrac{(x_1+ x_2)\pm y}{2}=\dfrac{-a_1 \pm\sqrt{a^2_1-4a_2}}{2} $$
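
As a small illustration, the two steps (compute $y$ from its symmetric square, then recover the roots from $x_1+x_2$ and $x_1-x_2$) can be sketched in Python. The name `solve_quadratic` is a hypothetical helper, and `cmath.sqrt` is used so that a negative or complex discriminant is handled as well.

```python
import cmath


def solve_quadratic(a1, a2):
    """Roots of X^2 + a1*X + a2 = 0 via the non-symmetric function y = x1 - x2."""
    sigma1 = -a1                       # x1 + x2
    y = cmath.sqrt(a1 * a1 - 4 * a2)   # one of the two square roots of y^2
    return (sigma1 + y) / 2, (sigma1 - y) / 2


# X^2 - 3X + 2 = (X - 1)(X - 2): the roots are 2 and 1, returned as complex numbers
x1, x2 = solve_quadratic(-3, 2)
print(x1, x2)
```

Choosing the other square root of $y^2$ only exchanges $x_1$ and $x_2$.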


For the cubic equation $$\begin{equation}\label{1.3.1.1}X^3 + a_1 X^2 + a_2 X + a_3 = 0,\end{equation}$$ we must solve the system $$ \begin{align} \sigma_1 &= x_1 + x_2 + x_3 = -a_1,\label{1.3.1.2}\\ \sigma_2 &= x_1 x_2 + x_1 x_3 + x_2 x_3 = a_2,\\ \sigma_3 &= x_1 x_2 x_3 = -a_3. \end{align} $$ It is natural to generalize \eqref{1.2.2} as follows: set $$ \begin{align} y_1 &= x_1 + \rho x_2 + \rho^2 x_3\nonumber\\ y_2 &= x_1 + \rho^2 x_2 + \rho x_3,\label{1.3.1.3} \end{align} $$ where $\rho=\exp(2\pi\ci/3)=(-1+\ci\sqrt{3})/2$ is a primitive cube root of unity and $\rho^2=\rho^{-1}=\exp(-2\pi\ci/3)=-1-\rho$. Recall that $\exp(\ci\theta)= \cos\theta+\ci\sin\theta$.

What are the symmetries of the functions $y_1, y_2$? If we exchange $x_1$ and $x_2$ (resp. $x_2$ and $x_3$), the functions $y_1, y_2$ are transformed according to the formulas $$ \begin{align*} y_1& \to x_2 + \rho x_1 + \rho^2 x_3 = \rho y_2,\\ y_2 & \to x_2 + \rho^2 x_1 + \rho x_3 = \rho^2 y_1\\ \text{resp.}&\\ y_1 &\to x_1 + \rho x_3 + \rho^2 x_2 = y_2,\\ y_2 &\to x_1 + \rho^2 x_3 + \rho x_2 = y_1. \end{align*} $$ As a result, the functions $y_1 y_2$ and $y_1^3 + y_2^3$ are symmetric in $x_1, x_2, x_3$ (this can also be seen directly from \eqref{1.3.1.3}): $$ \begin{align*} y_1 y_2& = x^2_1 + x^2_2 + x^2_3 - x_1 x_2 - x_1 x_3 - x_2 x_3\\ y_1^3 + y_2^3 &= 2 (x^3_1 + x^3_2 + x^3_3) + 12x_1 x_2 x_3 \\ &- 3 (x^2_1 x_2 + x_1 x^2_2 + x^2_1 x_3 + x_1 x^2_3 + x^2_2 x_3 + x_2 x^2_3) \end{align*} $$ Can we express them as functions of $\sigma_1, \sigma_2, \sigma_3$? Let $$ \begin{align*} s_k &= x^k_1 + x^k_2 + x^k_3,\\ s_{2,1}& = x^2_1 x_2 + x_1 x^2_2 + x^2_1 x_3 + x_1 x^2_3 + x^2_2 x_3 + x_2 x^2_3, \end{align*} $$ so that $$ \begin{align*} y_1^3 + y_2^3& = 2s_3 + 12\sigma_3 - 3s_{2,1},\\ y_1 y_2 &= s_2 - \sigma_2. \end{align*} $$ The sums $s_2, s_3$ and $s_{2,1}$ can be expressed in terms of $\sigma_1, \sigma_2$ and $\sigma_3$: we have $$ \begin{align*} \sigma_1^2& = (x_1 + x_2 + x_3)^2 = x^2_1 + x^2_2 + x^2_3 + 2 (x_1 x_2 + x_1 x_3 + x_2 x_3) \\ &= s_2 + 2\sigma_2\\ \sigma_1 \sigma_2 &= (x_1 + x_2 + x_3) (x_1 x_2 + x_1 x_3 + x_2 x_3)= s_{2,1} + 3x_1 x_2 x_3 \\ & = s_{2,1} + 3\sigma_3\\ \sigma_1 s_2 &= (x_1 + x_2 +x_3) (x^2_1+x^2_2+x^2_3)=(x^3_1+x^3_2+x^3_3)+ s_{2,1}\\& = s_3 + s_{2,1} \end{align*} $$ Hence $$ \begin{align*} s_2 = \sigma_1^2 - 2\sigma_2,&\quad s_{2,1} = \sigma_1 \sigma_2 - 3\sigma_3\\ s_3 = \sigma_1 s_2 - s_{2,1} &= \sigma_1^3 - 3\sigma_1 \sigma_2 + 3\sigma_3 \end{align*} $$ and $$ \begin{equation}\label{1.3.1.7} y_1 y_2 = \sigma_1^2 - 3\sigma_2, \qquad y_1^3 + y_2^3 = 2\sigma_1^3 - 9\sigma_1 \sigma_2 + 27\sigma_3 \end{equation} $$ In general, any symmetric polynomial $F (x_1, \ldots, x_n)$ can be expressed as a polynomial in $\sigma_1, \ldots, \sigma_n$. In summary, the cubes $y_1^3, y_2^3$ are the roots of the quadratic equation $$(t - y_1^3) (t - y_2^3) = t^2 - (2\sigma_1^3 - 9\sigma_1 \sigma_2 + 27\sigma_3) t + (\sigma_1^2 - 3\sigma_2)^3 = 0$$ whose sum and product of roots are obtained from \eqref{1.3.1.7}.
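
The two identities for $y_1y_2$ and $y_1^3+y_2^3$ in terms of $\sigma_1,\sigma_2,\sigma_3$ can be checked numerically for arbitrary values of $x_1,x_2,x_3$. The sketch below is only a sanity check, not a proof; `lagrange_resolvents` and `check_identities` are illustrative names.

```python
import cmath

RHO = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity


def lagrange_resolvents(x1, x2, x3):
    """y1 = x1 + rho*x2 + rho^2*x3 and y2 = x1 + rho^2*x2 + rho*x3."""
    y1 = x1 + RHO * x2 + RHO**2 * x3
    y2 = x1 + RHO**2 * x2 + RHO * x3
    return y1, y2


def check_identities(x1, x2, x3):
    """Return the absolute errors in the two symmetric-function identities."""
    s1 = x1 + x2 + x3
    s2 = x1 * x2 + x1 * x3 + x2 * x3
    s3 = x1 * x2 * x3
    y1, y2 = lagrange_resolvents(x1, x2, x3)
    err_product = abs(y1 * y2 - (s1**2 - 3 * s2))
    err_cubes = abs(y1**3 + y2**3 - (2 * s1**3 - 9 * s1 * s2 + 27 * s3))
    return err_product, err_cubes


# Both errors should be at the level of floating-point rounding.
print(check_identities(1, 2, 4))
print(check_identities(1 + 2j, -0.5, 3j))
```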



The equation $X^3+a_1X^2+a_2X+a_3=0$ can be transformed into $$X^3 + pX + q = 0$$ by the change of variable $X \mapsto X + a_1 / 3$ (that is, by taking $X + a_1/3$ as the new unknown). Applying the relations \eqref{1.1.3} to this equation, we get $$ \sigma_1 = 0,\qquad \sigma_2 = p, \qquad \sigma_3 = -q, $$ therefore \eqref{1.3.1.7} becomes $$ \begin{equation}\label{1.3.2.2} y_1 y_2 = -3p,\qquad y_1^3 + y_2^3 = -27q \end{equation} $$ and $y_1^3, y_2^3$ are the roots of the quadratic equation $$ (t - y_1^3) (t - y_2^3) = t^2 + 27qt - 27p^3 = 0. $$ Solving it as in the section on equations of degree 2, we obtain $$ y_1^3, y_2^3 = 27 \left(-\frac{q}{2}\pm \sqrt{\left(\frac{p}{3}\right)^3+\left(\frac{q}{2}\right)^2} \right) $$ and taking cube roots gives $y_1,y_2$. Combining \eqref{1.3.1.2} and \eqref{1.3.1.3} and using $1+\rho+\rho^2=0$ gives $$ \begin{align*} y_1 + y_2+\sigma_1 &= 3x_1, \\ \rho^2 y_1 + \rho y_2+\sigma_1 &= 3x_2,\\ \rho y_1 + \rho^2 y_2 +\sigma_1&= 3x_3 \end{align*} $$ Since $\sigma_1=0$ in our case, the roots $x_1, x_2, x_3$ are given by the following formulas (called "Cardano's formulas"): $$ \begin{align*} x_1&=\sqrt[3]{-\frac{q}{2}+ \sqrt{\left(\frac{p}{3}\right)^3+\left(\frac{q}{2}\right)^2} }+\sqrt[3]{-\frac{q}{2}- \sqrt{\left(\frac{p}{3}\right)^3+\left(\frac{q}{2}\right)^2} }\\ x_2&=\rho^2\sqrt[3]{-\frac{q}{2}+ \sqrt{\left(\frac{p}{3}\right)^3+\left(\frac{q}{2}\right)^2} }+\rho\sqrt[3]{-\frac{q}{2}- \sqrt{\left(\frac{p}{3}\right)^3+\left(\frac{q}{2}\right)^2} }\\ x_3&=\rho\sqrt[3]{-\frac{q}{2}+ \sqrt{\left(\frac{p}{3}\right)^3+\left(\frac{q}{2}\right)^2} }+\rho^2\sqrt[3]{-\frac{q}{2}- \sqrt{\left(\frac{p}{3}\right)^3+\left(\frac{q}{2}\right)^2} } \end{align*} $$ Here the cube roots are normalized by the first equation of \eqref{1.3.2.2}, that is, $$ \sqrt[3]{-\frac{q}{2}+ \sqrt{\left(\frac{p}{3}\right)^3+\left(\frac{q}{2}\right)^2} }~~\cdot~~ \sqrt[3]{-\frac{q}{2}- \sqrt{\left(\frac{p}{3}\right)^3+\left(\frac{q}{2}\right)^2} }=-\frac{p}{3} $$
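
Cardano's formulas, together with the normalization of the two cube roots, can be sketched in Python as follows. The function name `cardano` is hypothetical. Two choices here are assumptions of this sketch rather than part of the text: the second cube root is obtained from the first through the normalization (their product equals $-p/3$) instead of being computed independently, and the radicand of larger absolute value is used for the first cube root so that the division is numerically safe.

```python
import cmath

RHO = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity


def cardano(p, q):
    """Roots of X^3 + p*X + q = 0 (p, q real or complex), as three complex numbers."""
    root = cmath.sqrt((p / 3) ** 3 + (q / 2) ** 2)
    # Pick the radicand of larger absolute value for the first cube root.
    u_cubed = -q / 2 + root
    if abs(u_cubed) < abs(-q / 2 - root):
        u_cubed = -q / 2 - root
    if u_cubed == 0:
        return 0j, 0j, 0j  # only happens for p = q = 0, where X^3 = 0
    u = u_cubed ** (1 / 3)   # some cube root (the principal one)
    v = -p / (3 * u)         # second cube root, fixed by u * v = -p / 3
    return u + v, RHO**2 * u + RHO * v, RHO * u + RHO**2 * v


# X^3 - 8X - 8 = 0: the roots are -2, 1 + sqrt(5), 1 - sqrt(5), in some order
for x in cardano(-8, -8):
    print(x, abs(x**3 - 8 * x - 8))
```

Which of the three roots is returned first depends on which cube root is picked for `u`; a different choice only permutes $x_1, x_2, x_3$.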


Consider the equation $X^3 - 8X - 8 = 0$, which has an obvious root $x_1 = -2$, so it factorizes as $$\begin{align*} X^3 - 8X - 8& = (X + 2) (X^2 - 2X - 4)\\& = (X + 2) (X - (1 +\sqrt{5}))(X - (1 -\sqrt{5})) \end{align*} $$ so the roots are $$\begin{equation}\label{1.3.3.2}x_1=-2,\qquad x_2=1+\sqrt{5},\qquad x_3=1-\sqrt{5}\end{equation}$$ On the other hand, Cardano's formulas for $p = q = -8$ give $$ \begin{align*} x_1&=\sqrt[3]{4+\frac{4\ci}{9} \sqrt{15} }+\sqrt[3]{4-\frac{4\ci}{9} \sqrt{15} }\\ x_2&=\rho^2\sqrt[3]{4+\frac{4\ci}{9} \sqrt{15} }+\rho\sqrt[3]{4-\frac{4\ci}{9} \sqrt{15} }\\ x_3&=\rho\sqrt[3]{4+\frac{4\ci}{9} \sqrt{15} }+\rho^2\sqrt[3]{4-\frac{4\ci}{9} \sqrt{15} } \end{align*} $$ with normalization $$ \sqrt[3]{4+\frac{4\ci}{9} \sqrt{15} }~~\cdot~~ \sqrt[3]{4-\frac{4\ci}{9} \sqrt{15} }=\frac{8}{3} $$ It is not at all obvious that these expressions simplify to the roots \eqref{1.3.3.2}. Of course, substituting the roots \eqref{1.3.3.2} into the formulas \eqref{1.3.1.3} shows that $$ y_1, y_2 = -3 \pm\ci\sqrt{15}, $$ and dividing by $3$ and recalling the formulas for $y_1^3,y_2^3$ gives $$ \sqrt[3]{4\pm\frac{4\ci}{9} \sqrt{15} } = -1 \pm\frac{\ci\sqrt{15}}{3}, $$ which reconciles the two descriptions of the roots. But this simplification cannot be deduced without already knowing the roots \eqref{1.3.3.2}, which are needed to get $y_1,y_2$ from \eqref{1.3.1.3}.
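
The simplification of the cube roots in this example can at least be confirmed numerically. The following sketch checks that $-1\pm\ci\sqrt{15}/3$ cubes to $4\pm\frac{4\ci}{9}\sqrt{15}$, that the normalization holds, and that the three combinations reproduce the roots; the variable names are illustrative.

```python
import cmath
import math

RHO = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity

z_plus = complex(-1, math.sqrt(15) / 3)            # claimed cube root of 4 + (4i/9) sqrt(15)
z_minus = z_plus.conjugate()                       # claimed cube root of 4 - (4i/9) sqrt(15)
radicand_plus = complex(4, 4 * math.sqrt(15) / 9)

cube_error = abs(z_plus**3 - radicand_plus)        # should be ~0
product_error = abs(z_plus * z_minus - 8 / 3)      # normalization, should be ~0

x1 = z_plus + z_minus                              # expected: -2
x2 = RHO**2 * z_plus + RHO * z_minus               # expected: 1 + sqrt(5)
x3 = RHO * z_plus + RHO**2 * z_minus               # expected: 1 - sqrt(5)

print(cube_error, product_error)
print(x1, x2, x3)
```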



Let us proceed to the equation of degree $4$ given as $$\begin{equation}\label{1.4.1.1}X^4+a_1X^3+a_2X^2+a_3X+a_4=0 \end{equation}$$ The change of variable $X \mapsto X + a_1/4$ (that is, taking $X + a_1/4$ as the new unknown) reduces to the case $a_1=0$, that is, to the equation $$X^4 + pX^2 + qX + r = 0,$$ for which $$ \begin{equation}\label{1.4.1.2} \sigma_1 = 0,\qquad \sigma_2 = p,\qquad \sigma_3 = -q,\qquad \sigma_4 = r. \end{equation} $$ To generalize \eqref{1.3.1.3}, consider the behavior of the linear expressions $$ x_1 + \ci x_2 - x_3 - \ci x_4,\qquad x_1 - x_2 + x_3 - x_4,\qquad x_1 - \ci x_2 - x_3 + \ci x_4 $$ under the exchanges of variables $$\begin{equation}\label{1.4.1.3} x_1 \longleftrightarrow x_2,\qquad x_2 \longleftrightarrow x_3,\qquad x_3 \longleftrightarrow x_4. \end{equation} $$ It is easy to see that the set of linear polynomials $$ u_1 = x_1 + x_2 - x_3 - x_4,\quad u_2 = x_1 + x_3 - x_2 - x_4,\quad u_3 = x_1 + x_4 - x_2 - x_3 $$ is preserved by the substitutions \eqref{1.4.1.3} up to sign, so the set of squares $u_i^2$ is preserved exactly. As a result, the coefficients of the cubic polynomial in $u$ $$ (u - u^2_1) (u - u^2_2) (u - u^2_3) $$ are symmetric in $x_1, x_2, x_3, x_4$.
Now, consider a slightly modified polynomial: $$\begin{equation}\label{1.4.1.5} (y - y_1) (y - y_2) (y - y_3) = y^3 + b_1 y^2 + b_2 y + b_3, \end{equation} $$ where $$ y_1 = x_1 x_2 + x_3 x_4,\qquad y_2 = x_1 x_3 + x_2 x_4,\qquad y_3 = x_1 x_4 + x_2 x_3. $$ The $y_i$ are closely related to the $u_i$: since $x_1+x_2+x_3+x_4=\sigma_1=0$ (for example, $x_1+x_2=-(x_3+x_4)$ for the first line), we have $$ \begin{align} - (u_1 / 2)^2 &= - (x_1 + x_2)^2 = (x_1 + x_2) (x_3 + x_4) = y_2 + y_3\label{1.4.1.6}\\ - (u_2 / 2)^2 &= - (x_1 + x_3)^2 = (x_1 + x_3) (x_2 + x_4) = y_1 + y_3\nonumber\\ -(u_3 /2)^2 &= -(x_1 + x_4 )^2 = (x_1 + x_4 )(x_2 + x_3 ) = y_1 + y_2 \nonumber \end{align} $$ Let $s_{i_1\ldots i_k} $ denote the symmetrization of $x_1^{i_1} \cdots x^{i_k}_k$ (the sum of all distinct monomials obtained from it by permuting the variables). Then the coefficients of \eqref{1.4.1.5} can be expressed as $$\begin{align*} -b_1 &= y_1 + y_2 + y_3\\& = \sigma_2\\ b_2 &= y_1 y_2 + y_1 y_3 + y_2 y_3 \\ &= x^2_1 x_2 x_3 + \cdots + x_2 x_3 x^2_4\\ & = s_{2,1,1}\\ -b_3 &= y_1 y_2 y_3 \\&= (x^3_1 x_2 x_3x_4 + \cdots + x_1 x_2 x_3 x^3_4) \\&+ (x^2_1 x^2_2 x^2_3 + \cdots + x^2_2 x^2_3 x^2_4) \\&= s_{3,1,1,1} + s_{2,2,2} . \end{align*} $$ The formulas $$ \begin{align*} \sigma_1 \sigma_3 &= (x_1 + \cdots + x_4) (x_1 x_2 x_3 + \cdots + x_2 x_3 x_4)\\& = s_{2,1,1} + 4x_1 x_2 x_3 x_4 \\&= s_{2,1,1} + 4\sigma_4\\ \sigma_3^2 &= (x_1 x_2 x_3 + \cdots + x_2 x_3 x_4)^2 \\&= x^2_1 x^2_2 x^2_3 + \cdots + x^2_2 x^2_3 x^2_4 + 2 (x^2_1 x^2_2 x_3 x_4 + \cdots + x_1 x_2 x^2_3 x^2_4)\\& = s_{2,2,2} + 2\sigma_2 \sigma_4\\ \sigma_4 s_2 &= x_1 x_2 x_3 x_4 (x^2_1 +x_2^2+x_3^2 + x^2_4) \\&= x^3_1 x_2 x_3 x_4 + \cdots + x_1 x_2 x_3 x^3_4 \\&= s_{3,1,1,1} \end{align*} $$ show that we have in general (without assuming $\sigma_1 = 0$) $$ \begin{align*} s_{2,1,1} &= \sigma_1 \sigma_3 - 4\sigma_4,\quad s_{2,2,2} = \sigma_3^2 - 2\sigma_2 \sigma_4,\\ s_{3,1,1,1} &= \sigma_4 s_2 = \sigma_4 (\sigma_1^2 - 2\sigma_2) \end{align*} $$ where $\sigma_1=x_1+x_2+x_3+x_4$ and $\sigma_2=\sum_{1\leq i<j\leq 4}x_ix_j$. Substituting these expressions, we get $$ b_1 = -\sigma_2,\qquad b_2 = \sigma_1 \sigma_3 - 4\sigma_4,\qquad b_3 = -\sigma_1^2 \sigma_4 + 4\sigma_2 \sigma_4 - \sigma_3^2, $$ and using \eqref{1.4.1.2} gives $$ b_1 = -p,\qquad b_2 = -4r,\qquad b_3 = 4pr - q^2. $$ In summary, the expressions $y_1, y_2, y_3$ are the roots of the cubic equation $$\begin{equation}\label{1.4.1.9} y^3 - py^2 - 4ry + (4pr - q^2) = 0 \end{equation} $$ This equation is called the resolvent cubic of the quartic equation \eqref{1.4.1.1}. As soon as we know how to solve \eqref{1.4.1.9}, we deduce from \eqref{1.4.1.6} the values of $x_i + x_j$ ($i\neq j$), hence those of the $x_i$.
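
The resolvent cubic can be checked on a concrete quartic whose roots are known and sum to zero. The sketch below starts from the roots $1, 2, 3, -6$ (an arbitrary choice made for this illustration), computes $p, q, r$ from the $\sigma_k$, and verifies that $y_1, y_2, y_3$ satisfy $y^3 - py^2 - 4ry + (4pr - q^2) = 0$. The helper names are hypothetical.

```python
from itertools import combinations
from math import prod


def sigma(roots, k):
    """Elementary symmetric polynomial sigma_k of the given roots."""
    return sum(prod(subset) for subset in combinations(roots, k))


def resolvent_cubic(p, q, r):
    """Coefficients (b1, b2, b3) of y^3 + b1*y^2 + b2*y + b3 for X^4 + pX^2 + qX + r."""
    return -p, -4 * r, 4 * p * r - q * q


def resolvent_roots(x1, x2, x3, x4):
    """y1 = x1x2 + x3x4, y2 = x1x3 + x2x4, y3 = x1x4 + x2x3."""
    return x1 * x2 + x3 * x4, x1 * x3 + x2 * x4, x1 * x4 + x2 * x3


roots = (1, 2, 3, -6)                       # sigma_1 = 0, as the reduction requires
p, q, r = sigma(roots, 2), -sigma(roots, 3), sigma(roots, 4)
b1, b2, b3 = resolvent_cubic(p, q, r)
y1, y2, y3 = resolvent_roots(*roots)

# Each y_i should be a root of the resolvent cubic (residual 0 in exact integer arithmetic).
residuals = [y**3 + b1 * y**2 + b2 * y + b3 for y in (y1, y2, y3)]
print(residuals)

# From -(x1 + x2)^2 = y2 + y3 we recover x1 + x2 up to sign.
pair_sum_squared = -(y2 + y3)
print(pair_sum_squared, (roots[0] + roots[1]) ** 2)
```

Integer roots are used so that every check is exact; with floating-point or complex roots the residuals would only vanish up to rounding.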


This page is a translation of content from Théorie de Galois by Jan Nekovář.