Monday, 16 February 2015

general relativity - Matrix derivative of a matrix with constraints


I am looking for a general method to obtain derivative rules of a constrained matrix with respect to its matrix elements.


In the case of a symmetric matrix $S_{ij}$ (with $S_{ij}=S_{ji}$), one way to do that is the following (see Variation of the metric with respect to the metric). We say that a variation of a matrix element $\delta S_{ij}$ is the same than that of $\delta S_{ji}$, and thus $$ \delta S_{ij}=\frac{\delta S_{ij}+\delta S_{ji}}{2}=\frac{\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk}}{2}\delta S_{kl}=\mathcal S_{ij;kl}\delta S_{kl}. $$ The tensor $\mathcal S_{ij;kl}$ has the nice property that $\mathcal S_{ij;kl}\mathcal S_{kl;mn}=\mathcal S_{ij;mn}$. One then says that $$ \frac{\delta S_{ij}}{\delta S_{kl}}=\mathcal S_{ij;kl}. $$


I must admit that why this is the correct procedure is not quite clear to me (that seems to be quite arbitrary, although obviously it works to compute derivatives of a function of a symmetric matrix). This means that it is not clear to me how to generalize that when the constraint is different.


For example, let's take the set of matrices $O$ belonging to the group $SO(N)$. Is there a way to write $\frac{\delta O_{ij}}{\delta O_{kl}}$ in terms of a tensor $\mathcal B_{ij;kl}$, with all the same nice properties ?


In the case of $SO(2)$, this seems quite easy, since then $O_{ji}=(-1)^{i+j}O_{ij}$, and one finds in that case $$ \frac{\delta O_{ij}}{\delta O_{kl}}=\frac{\delta_{ik}\delta_{jl}+(-1)^{i+j}\delta_{il}\delta_{jk}}{2}, $$ which indeed does the job. Note however that I haven't use the defining property of $SO(N)$, that is $O O^T=1$, and I am not sure if this is relevant...


Already in the case of $SO(3)$, it does not seem to be easy to find the equivalent tensor...





Side note : using the defining property of $SO(2)$, one can massage the formulas to obtain $$ \frac{\delta O_{ij}}{\delta O_{kl}}=-O_{il}O_{kj}. $$ First of all, it depends explicitly on $O$, which seems bad. Furthermore, if we tentatively define $\mathcal B_{ij;kl}[O]=-O_{il}O_{kj}$ (which already is different from what we found for $SO(2)$), then we have $\mathcal B_{ij;kl}[O]\mathcal B_{kl;mn}[O]=\delta_{im}\delta_{jn}$, which seems pretty weird...




If anyone knows the standard procedure (if any exists) or a good reference, that would be greatly appreciated. In any case, a nice explanation (maybe a bit formal) in the case of symmetric matrix might also help me to get my head around the problem.



Answer





  1. Setup. Let there be given an $m$-dimensional manifold $M$ with coordinates $(x^1, \ldots, x^m)$. Let there be given an $n$-dimensional physical submanifold $N$ with physical coordinates $(y^1, \ldots, y^n)$. Let there be given $m-n$ independent constraints $$ \chi^1(x)~\approx~ 0,\quad \ldots,\quad \chi^{m-n}(x)~\approx~ 0, \tag{1}$$ which defines the physical submanifold $N$. [Here the $\approx$ symbol means weak equality, i.e. equality modulo the constraints.] Assume that $$(y^1, \ldots, y^n, \chi^1, \ldots,\chi^{m-n})\tag{2}$$ constitutes a coordinate system for the extended manifold $M$.




  2. Dirac derivative. In analogy with the Dirac bracket, let us introduce a Dirac derivative $$\left(\frac{\partial}{\partial x^i}\right)_{\! D} ~:=~\frac{\partial}{\partial x^i} -\sum_{a=1}^{m-n}\frac{\partial \chi^a}{\partial x^i}\left(\frac{\partial }{\partial \chi^a}\right)_{\! y} ~=~\sum_{\alpha=1}^n\frac{\partial y^{\alpha}}{\partial x^i} \left(\frac{\partial }{\partial y^{\alpha}}\right)_{\! \chi} , \qquad i~\in~\{1,\ldots, m\}, \tag{3}$$ that projects onto the physical submanifold $$\left(\frac{\partial}{\partial x^i}\right)_{\! D} y^{\alpha}~=~\frac{\partial y^{\alpha}}{\partial x^i} , \qquad \left(\frac{\partial}{\partial x^i}\right)_{\! D} \chi^a~=~0, $$ $$i~\in~\{1,\ldots, m\},\qquad \alpha~\in~\{1,\ldots, n\},\qquad a~\in~\{1,\ldots, m\!-\!n\}.\tag{4}$$





  3. Remark. In many important cases it is possible to choose the physical coordinates $(y^1, \ldots, y^n)$ such that the Dirac derivative (4) can be written as linear combinations of unconstrained partial $x$-derivatives only, without referring to the $(y,\chi)$-coordinate system (2), cf. eqs. (10) & (14) below.




  4. Does Dirac derivatives commute? Does the commutator $$\left[ \left(\frac{\partial}{\partial x^i} \right)_{\! D}, \left(\frac{\partial}{\partial x^j}\right)_{\! D}\right] ~=~\sum_{\alpha,\beta=1}^n\frac{\partial y^{\alpha}}{\partial x^i}\left[\left(\frac{\partial }{\partial y^{\alpha}}\right)_{\! \chi}, \frac{\partial y^{\beta}}{\partial x^j}\right] \left(\frac{\partial }{\partial y^{\beta}}\right)_{\! \chi} - (i\leftrightarrow j)~\stackrel{?}{\approx}~0 \tag{5}$$ vanishes weakly? Not necessarily. But if the coordinate transformation $x^i \leftrightarrow (y^{\alpha}, \chi^a)$ is linear, then the Dirac derivatives commute.




  5. Example. Let the physical subspace be the hyperplane $N=\{\chi(x)=0\}$ with the constraint $$\chi~=~\sum_{i=1}^m x^i.\tag{6}$$ Define physical coordinates $$y^{\alpha}~=~ x^{\alpha} -\frac{1}{m}\sum_{i=1}^m x^i, \qquad \alpha~\in~\{1,\ldots, n\!=\!m\!-\!1\}.\tag{7} $$ Conversely, $$ x^{\alpha}~=~y^{\alpha} +\frac{1}{m}\chi, \qquad \alpha~\in~\{1,\ldots, n\}, \qquad x^m~=~-\sum_{\beta=1}^n y^{\beta}+\frac{1}{m}\chi.\tag{8}$$ The derivatives are related as $$\frac{\partial}{\partial x^{\alpha}} ~=~\left(\frac{\partial }{\partial y^{\alpha}}\right)_{\! \chi} -\frac{1}{m} \sum_{\beta=1}^n\left(\frac{\partial }{\partial y^{\beta}}\right)_{\! \chi} +\left(\frac{\partial }{\partial \chi}\right)_{\! y}, \qquad \alpha~\in~\{1,\ldots, n\}, $$ $$\frac{\partial}{\partial x^m} ~=~ -\frac{1}{m} \sum_{\beta=1}^n\left(\frac{\partial }{\partial y^{\beta}}\right)_{\! \chi} +\left(\frac{\partial }{\partial \chi}\right)_{\! y}.\tag{9}$$ The Dirac derivative becomes after some algebra $$\left(\frac{\partial}{\partial x^i}\right)_{\! D} ~=~\frac{\partial}{\partial x^i} -\left(\frac{\partial }{\partial \chi}\right)_{\! y} ~=~\frac{\partial}{\partial x^i} -\frac{1}{m} \sum_{j=1}^m\frac{\partial}{\partial x^j},\qquad i~\in~\{1,\ldots, m\}.\tag{10}$$





  6. Example. Differentiation wrt. a symmetric matrix can be viewed as a Dirac differentiation (3), where the constraints (1) are given by antisymmetric matrices. Define $$ s_{(ij)}~:=~ \frac{M_{ij}+M_{ji}}{2} \quad\text{and}\quad a_{(ij)}~:=~ \frac{M_{ij}-M_{ji}}{2}\quad\text{for}\quad i~>~j; \quad\text{and}\quad d_{(i)}~:=~M_{ii}.\tag{11}$$ Conversely, $$ M_{ij}~=~\theta_{ij}(s_{(ij)}+a_{(ij)})+\theta_{ji}(s_{(ji)}-a_{(ji)}) + \delta_{ij}d_{(i)} ,\tag{12}$$ where the discrete Heaviside step function $\theta_{ij}$ here is assumed to obey $\theta_{ii}=0$ (no implicit sum). The derivatives are related as $$\frac{\partial}{\partial M_{ij}} ~=~\frac{\theta_{ij}}{2}\left(\frac{\partial}{\partial s_{(ij)}}+\frac{\partial}{\partial a_{(ij)}}\right)+\frac{\theta_{ji}}{2}\left(\frac{\partial}{\partial s_{(ji)}}-\frac{\partial}{\partial a_{(ji)}}\right) + \delta_{ij}\frac{\partial}{\partial d_{(i)}} .\tag{13}$$ The Dirac derivative becomes after some algebra $$ \left(\frac{\partial}{\partial M_{ij}}\right)_{\! D}~=~\frac{\theta_{ij}}{2}\frac{\partial}{\partial s_{(ij)}}+\frac{\theta_{ji}}{2}\frac{\partial}{\partial s_{(ji)}} + \delta_{ij}\frac{\partial}{\partial d_{(i)}} ~=~\frac{1}{2} \left(\frac{\partial}{\partial M_{ij}} +\frac{\partial}{\partial M_{ji}}\right).\tag{14}$$




  7. Remark. Additional complications arise if the coordinates and/or constraints are not globally defined. For starters, it is actually enough if (2) is a coordinate system in a tubular neighborhood of $N$.




  8. Reparametrizations of the constraints. Assume that there exists a second coordinate system $$(\tilde{y}^1, \ldots, \tilde{y}^n, \tilde{\chi}^1, \ldots,\tilde{\chi}^{m-n})\tag{15}$$ (which we adorn with tildes), such that $$\tilde{y}^{\alpha}~=~f^{\alpha}(y), \qquad \tilde{\chi}^a ~=~ g^a(y,\chi)~\approx~0.\tag{16}$$ This implies that $$ \left(\frac{\partial }{\partial \chi^a}\right)_{\! y} ~=~ \left(\frac{\partial \tilde{\chi}^b}{\partial \chi^a}\right)_{\! y} \left(\frac{\partial }{\partial \tilde{\chi}^b}\right)_{\! \tilde{y}}, \qquad \left(\frac{\partial }{\partial y^{\alpha}}\right)_{\! \chi} ~\approx~\left(\frac{\partial \tilde{y}^{\beta}}{\partial y^{\alpha}}\right)_{\! \chi} \left(\frac{\partial }{\partial \tilde{y}^{\beta}}\right)_{\! \tilde{\chi}}, \tag{17}$$ i.e $$\Delta_{\chi}~:=~{\rm span}\left\{\left(\frac{\partial }{\partial \chi^1}\right)_{\! y}, \ldots ,\left(\frac{\partial }{\partial \chi^{n-m}}\right)_{\! y}\right\} ~\subseteq~ TM\tag{18}$$ is an involutive distribution, while $$\Delta_y~:=~{\rm span}\left\{\left(\frac{\partial }{\partial y^1}\right)_{\! \chi}, \ldots ,\left(\frac{\partial }{\partial y^n}\right)_{\! \chi}\right\} ~\subseteq~ TM\tag{19}$$ is a weak distribution.


    One may show that the Dirac derivative and its commutators $$ \left(\frac{\partial}{\partial x^i}\right)^{\sim}_{\! D}~\approx~\left(\frac{\partial}{\partial x^i}\right)_{\! D}, \qquad \left[ \left(\frac{\partial}{\partial x^i} \right)^{\sim}_{\! D}, \left(\frac{\partial}{\partial x^j}\right)^{\sim}_{\! D}\right]~\approx~ \left[ \left(\frac{\partial}{\partial x^i} \right)_{\! D}, \left(\frac{\partial}{\partial x^j}\right)_{\! D}\right], \tag{20}$$ [wrt. the tilde and the untilde coordinate systems (15) and (2), respectively] agree weakly. This shows that the Dirac derivative (3) is a geometric construction.





  9. Subsubmanifold. Given a $p$-dimensional physical subsubmanifold $P$ with physical coordinates $(z^1,\ldots,z^p)$. Let there be given $n-p$ independent constraints $$ \phi^1(y)~\approx~ 0,\quad \ldots,\quad \phi^{n-p}(y)~\approx~ 0, \tag{21}$$ which defines the physical submanifold $P$. Assume that $$(z^1, \ldots, z^p, \phi^1, \ldots,\phi^{n-p})\tag{22}$$ constitutes a coordinate system for the submanifold $N$. One may show that $$\left(\frac{\partial}{\partial x^i}\right)^{\!(P)}_{\! D} ~=~\left(\frac{\partial}{\partial x^i}\right)^{\!(N)}_{\! D} - \sum_{a=1}^{n-p}\left(\frac{\partial \phi^a}{\partial x^i}\right)^{\!(N)}_{\! D}\left(\frac{\partial }{\partial \phi^a}\right)_{\!z}, \qquad i~\in~\{1,\ldots, m\}. \tag{23}$$ This shows that the Dirac derivative construction behaves naturally wrt. further constraints.




No comments:

Post a Comment

Understanding Stagnation point in pitot fluid

What is stagnation point in fluid mechanics. At the open end of the pitot tube the velocity of the fluid becomes zero.But that should result...