I'm working through an introductory electrodynamics text (Griffiths), and I encountered a pair of questions asking me to show that:
- the divergence transforms as a scalar under rotations
- the gradient transforms as a vector under rotations
I can see how to show these things mathematically, but I'd like to gain some intuition about what it means to "transform as a" vector or scalar. I have found definitions, but none using notation consistent with the Griffiths book, so I was hoping for some confirmation.
My guess is that "transforms as a scalar" applies to a scalar field, e.g. $T(y,z)$ (working in two dimensions since the questions in the book are limited to two dimensions). It says that if you relabel all of the coordinates in the coordinate system using: $$\begin{pmatrix}\bar{y} \\ \bar{z}\end{pmatrix} = \begin{pmatrix}\cos\phi & \sin\phi \\ -\sin\phi & \cos\phi\end{pmatrix} \begin{pmatrix}y \\ z\end{pmatrix}$$ so $(\bar{y},\bar{z})$ gives the relabeled coordinates for point $(y,z)$, then: $$\bar{T}(\bar{y},\bar{z}) = T(y,z)$$ for all $y$, $z$ in the coordinate system, where $\bar{T}$ is the rotated scalar field. Then I thought perhaps I'm trying to show something like this? $$\overline{(\nabla \cdot T)}(\bar{y},\bar{z})=(\nabla \cdot T)(y,z) $$ where $\overline{(\nabla \cdot T)}$ is the rotated divergence of $T$.
The notation above looks strange to me, so I'm wondering if it's correct. I'm also quite curious what the analogous formalization of "transforms as a vector field" would look like.
Answer
There are a number of ways of mathematically formalizing the notions "transforming as a vector" or "transforming as a scalar" depending on the context, but in the context you're considering, I'd recommend the following:
Consider a finite number of types of objects $o_1, \dots, o_n$, each of which lives in some set $O_i$ of objects, and each of which is defined to transform in a particular way under rotations. In other words, given any rotation $R$, for each object $o_i$ we have a mapping which, acting on objects in $O_i$, tells us what happens to them under the rotation $R$: \begin{align} o_i \mapsto o_i^R = \text{something we specify} \end{align} For example, if $o_1$ is just a vector $\mathbf r$ in three dimensional Euclidean space $\mathbb R^3$, then one would typically take \begin{align} \mathbf r \mapsto \mathbf r^R = R\mathbf r. \end{align} Each mapping $o_i\mapsto o_i^R$ is what a mathematician would call a group action of the group of rotations on the set $O_i$ (there are more details in defining a group action which we ignore here). Once we have specified how these different objects $o_i$ transform under rotations, we can make the following definition:
Definition. Scalar under rotations
Let any function $f:O_1\times O_2\times\cdots \times O_n\to \mathbb R$ be given. We say it is a scalar under rotations provided \begin{align} f(o_1^R, \dots, o_n^R) = f(o_1, \dots, o_n). \end{align} This definition is intuitively just saying that if you "build" an object $f$ out of a bunch of other objects $o_i$ whose transformation under rotations you have already specified, then the new object $f$ which you have constructed is considered a scalar if it doesn't change when you apply a rotation to all of the objects it's built out of.
Example. The dot product
Let $n=2$, and let $o_1 = \mathbf r_1$ and $o_2 = \mathbf r_2$ both be vectors in $\mathbb R^3$. We define $f$ as follows: \begin{align} f(\mathbf r_1, \mathbf r_2) = \mathbf r_1\cdot \mathbf r_2. \end{align} Is $f$ a scalar under rotations? Well let's see: \begin{align} f(\mathbf r_1^R, \mathbf r_2^R) = (R\mathbf r_1)\cdot (R\mathbf r_2) = \mathbf r_1\cdot (R^TR\mathbf r_2) = \mathbf r_1\cdot \mathbf r_2 = f(\mathbf r_1, \mathbf r_2) \end{align} where the third equality uses $R^TR = I$, which holds for any rotation. So yes it is!
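The invariance of the dot product can also be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`rotation_z`, `apply`, `dot`, `dot_is_rotation_invariant`), the rotation angle, and the two sample vectors are illustrative assumptions, not anything taken from Griffiths.

```python
import math

def rotation_z(angle):
    """3x3 matrix for a rotation by `angle` radians about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def apply(R, r):
    """Matrix-vector product R r, i.e. the rotated vector r^R."""
    return [sum(R[i][j] * r[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    """Ordinary dot product, playing the role of f(r1, r2)."""
    return sum(x * y for x, y in zip(a, b))

def dot_is_rotation_invariant(r1, r2, R, tol=1e-12):
    """Check f(R r1, R r2) == f(r1, r2) up to floating-point error."""
    return abs(dot(apply(R, r1), apply(R, r2)) - dot(r1, r2)) < tol

# Arbitrary sample data (assumptions chosen only for illustration).
R = rotation_z(0.7)
r1 = [1.0, 2.0, 3.0]
r2 = [-0.5, 4.0, 1.5]
print(dot_is_rotation_invariant(r1, r2, R))
```

A single numerical check like this is of course not a proof; it only illustrates what the defining equation of a scalar asserts.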
Now what about a field of scalars? How do we define such a beast? Well we just have to slightly modify the above definition.
Definition. Field of scalars
Let any function $f:O_1\times\cdots \times O_n\times\mathbb R^3\to \mathbb R$ be given, and write its value at the point $\mathbf x$ as $f(o_1, \dots, o_n)(\mathbf x)$. We call $f$ a field of scalars under rotations provided \begin{align} f(o_1^R, \dots, o_n^R)(R\mathbf x) = f(o_1, \dots, o_n)(\mathbf x). \end{align} You can think of this as simply saying that the rotated version of $f$ evaluated at the rotated point $R\mathbf x$ agrees with the unrotated version of $f$ evaluated at the unrotated point. Notice that this is formally the same as the equation you wrote down, namely $\bar T(\bar y, \bar z) = T(y,z)$.
Example. Divergence of a vector field
Consider the case that $\mathbf v$ is a vector field. Rotations are conventionally defined to act on vector fields as follows (I'll try to find another post on physics.SE that explains why): \begin{align} \mathbf v^R(\mathbf x) = R\mathbf v(R^{-1}\mathbf x) \end{align} Is its divergence a scalar field? Well to make contact with the definition we gave above, let $f$ denote the divergence, namely \begin{align} f(\mathbf v)(\mathbf x) = (\nabla\cdot \mathbf v)(\mathbf x) \end{align} Now notice that using the chain rule we get (we use Einstein summation notation) \begin{align} (\nabla\cdot\mathbf v^R)(\mathbf x) &= \nabla\cdot\big(R\mathbf v(R^{-1}\mathbf x)\big)\\ &= \partial_i\big(R_{ij}v_j(R^{-1}\mathbf x)\big) \\ &= R_{ij} \partial_i\big(v_j(R^{-1}\mathbf x)\big) \\ &= R_{ij}(R^{-1})_{ki}(\partial_k v_j)(R^{-1}\mathbf x)\\ &= \delta_{kj}(\partial_k v_j)(R^{-1}\mathbf x)\\ &= (\nabla\cdot \mathbf v)(R^{-1}\mathbf x) \end{align} where the second-to-last step uses $(R^{-1})_{ki}R_{ij} = (R^{-1}R)_{kj} = \delta_{kj}$. Replacing $\mathbf x$ by $R\mathbf x$, this implies that \begin{align} (\nabla\cdot\mathbf v^R)(R\mathbf x) = (\nabla\cdot \mathbf v)(\mathbf x), \end{align} but the left hand side is precisely $f(\mathbf v^R)(R\mathbf x)$ and the right side is $f(\mathbf v)(\mathbf x)$ so we have \begin{align} f(\mathbf v^R)(R\mathbf x) = f(\mathbf v)(\mathbf x). \end{align} This is precisely the condition that $f$ (the divergence of a vector field) be a scalar field under rotations.
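As a sanity check on this result, one can compare $(\nabla\cdot\mathbf v^R)(R\mathbf x)$ with $(\nabla\cdot\mathbf v)(\mathbf x)$ numerically. The sketch below uses only the Python standard library and approximates derivatives by central finite differences. The sample field `v`, the rotation angles, the evaluation point, and all helper names are assumptions made up for illustration.

```python
import math

def rotation_z(angle):
    """Rotation by `angle` radians about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_x(angle):
    """Rotation by `angle` radians about the x-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(R):
    """Transpose; for a rotation this equals the inverse."""
    return [list(row) for row in zip(*R)]

def apply(R, r):
    """Matrix-vector product R r."""
    return [sum(R[i][j] * r[j] for j in range(3)) for i in range(3)]

def v(x):
    """Arbitrary sample vector field (an assumption, chosen for illustration)."""
    return [x[0] ** 2 * x[1], x[1] * x[2], x[0] * x[2] ** 2]

def rotated_field(field, R):
    """Return v^R, defined by v^R(x) = R v(R^{-1} x)."""
    R_inv = transpose(R)
    return lambda x: apply(R, field(apply(R_inv, x)))

def divergence(field, x, h=1e-5):
    """Central-difference approximation of (div field)(x)."""
    total = 0.0
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (field(xp)[i] - field(xm)[i]) / (2 * h)
    return total

R = matmul(rotation_z(0.7), rotation_x(-0.4))  # a generic rotation
x = [0.3, -1.2, 0.8]

lhs = divergence(rotated_field(v, R), apply(R, x))  # (div v^R)(R x)
rhs = divergence(v, x)                              # (div v)(x)
print(abs(lhs - rhs) < 1e-6)
```

The tolerance `1e-6` is a loose bound on the finite-difference error, not a sharp estimate.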
Extension to vectors and vector fields.
To define a vector under rotations, and a field of vectors under rotations, we follow a very similar procedure, but instead we have functions $\mathbf f:O_1\times O_2\times\cdots \times O_n\to \mathbb R^3$ and $\mathbf f:O_1\times O_2\times\cdots \times O_n\times\mathbb R^3\to \mathbb R^3$ respectively (in other words, the right hand side of the arrow gets changed from $\mathbb R$ to $\mathbb R^3$), and the defining equations for a vector and a field of vectors become \begin{align} \mathbf f(o_1^R, \dots, o_n^R) = R\,\mathbf f(o_1, \dots, o_n) \end{align} and \begin{align} \mathbf f(o_1^R, \dots, o_n^R)(R\mathbf x) = R \,\mathbf f(o_1, \dots, o_n)(\mathbf x) \end{align} respectively. In other words, there is an extra $R$ multiplying the right hand side.
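The gradient from the original question fits this pattern with $\mathbf f(T) = \nabla T$. Assuming a scalar field transforms as $T^R(\mathbf x) = T(R^{-1}\mathbf x)$ (which is the relation $\bar T(\bar y,\bar z) = T(y,z)$ from the question), the condition to verify is $(\nabla T^R)(R\mathbf x) = R\,(\nabla T)(\mathbf x)$. Here is a rough numerical sketch of that check using only the Python standard library; the sample field `T`, the rotation, the evaluation point, and the helper names are all illustrative assumptions.

```python
import math

def rotation_z(angle):
    """Rotation by `angle` radians about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_x(angle):
    """Rotation by `angle` radians about the x-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(R):
    """Transpose; for a rotation this equals the inverse."""
    return [list(row) for row in zip(*R)]

def apply(R, r):
    """Matrix-vector product R r."""
    return [sum(R[i][j] * r[j] for j in range(3)) for i in range(3)]

def T(x):
    """Arbitrary sample scalar field (an assumption, chosen for illustration)."""
    return x[0] ** 2 * x[1] + x[1] * x[2] ** 2

def rotated_scalar_field(field, R):
    """Return T^R, assumed to be defined by T^R(x) = T(R^{-1} x)."""
    R_inv = transpose(R)
    return lambda x: field(apply(R_inv, x))

def gradient(field, x, h=1e-5):
    """Central-difference approximation of (grad field)(x)."""
    grad = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((field(xp) - field(xm)) / (2 * h))
    return grad

R = matmul(rotation_z(0.7), rotation_x(-0.4))  # a generic rotation
x = [0.3, -1.2, 0.8]

lhs = gradient(rotated_scalar_field(T, R), apply(R, x))  # (grad T^R)(R x)
rhs = apply(R, gradient(T, x))                           # R (grad T)(x)
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_error < 1e-6)
```

Compared with the scalar-field check, the only structural difference is the extra `apply(R, ...)` on the right hand side, which mirrors the extra $R$ in the defining equation.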