Electric flux is a defined quantity that is proportional to the number of field lines passing through a given area element for a given electric field. It is not proportional to the relative density of field lines, which would supply information regarding the strength of the field at that point. Electric flux, it seems to me, does not supply us with any practical information. It seems to me that electric flux is a quantity defined and modeled specifically for Gauss' law, to lend it some mathematical elegance and to add a visual aspect to the concept of electric fields. Perhaps this is why, especially for symmetric situations, Gauss' law can be used to easily determine the electric field due to a given charge distribution. Am I wrong here? Is there a physical significance to electric flux that I do not understand? Thanks for your answers.
Thursday, 31 May 2018
newtonian mechanics - Why doesn't Newton's third law mean a person bounces back to where they started when they hit the ground?
Why doesn't a person bounce back after falling down the way a ball does? If we push a person and he falls down, why doesn't he come back to his initial position?
According to Newton's third law of motion, to every action there is always an equal but opposite reaction. If we take the example of a ball, it bounces back with the same force with which it hit the ground. But in the case of a human body this law does not seem to apply. Why?
condensed matter - Symmetry Breaking And Phase transition
Is every phase transition associated with a symmetry breaking? If yes, what is the symmetry that a gaseous phase have but the liquid phase does not?
What is the extra symmetry that normal $\bf He$ has but superfluid $\bf He$ does not? Is the symmetry breaking, in this case, a gauge symmetry breaking?
Update Unlike gases, liquids have short-range order. Does it not mean that during the gas-to-liquid transition, the short-range order of liquids breaks the translation symmetry? At least locally?
Answer
Let me answer your first question: phase transitions do not necessarily imply a symmetry breaking. This is clear in the example you are mentioning: the liquid-gas transition is a first-order phase transition, but there is no symmetry breaking. Indeed, liquid and gas share the same symmetry (translation and rotation invariance) and may be continuously connected in the high temperature/pressure regime. In quantum systems at zero temperature, one may also encounter transitions between quantum spin-liquid states for which there is no symmetry breaking either. Yet another example is the 2D XY model, where there is a continuous phase transition but no symmetry breaking (the Kosterlitz-Thouless transition).
Wednesday, 30 May 2018
homework and exercises - Bouncing ball simulation computer science
In my computer science class I was given a problem where I have to simulate a bouncing ball using "real physics". I have been trying to find an equation that will give the height of each bounce for a given gravity and an arbitrary mass, so that I can calculate the next bounce and its height. A lot of the equations I've found require a time, but I don't really care about time: all I want is to get the next height after the previous bounce, until it finally reaches a height of 0 or close to it. I haven't taken a physics class since high school (5 years ago), and that was basic physics.
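One minimal way to set this up without tracking time at all (a sketch; assuming each impact simply scales the rebound speed by a coefficient of restitution $e$, which is a modelling choice rather than anything stated in the original problem):

# Successive bounce apex heights, no explicit time variable.
g = 9.81      # m/s^2; appears in the impact speed but cancels out of the height ratio
e = 0.8       # assumed coefficient of restitution, 0 < e < 1
h = 2.0       # initial drop height in metres

heights = [h]
while h > 0.01:                 # stop once the apex is below 1 cm
    # impact speed v = sqrt(2*g*h); rebound speed is e*v;
    # next apex h' = (e*v)**2 / (2*g) = e**2 * h, so mass and g both drop out
    h = e**2 * h
    heights.append(h)

print(heights)

Note that the arbitrary mass never enters: the apex heights form a geometric sequence $h_{n+1} = e^2 h_n$.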
fluorescence - If blackbody radiation at 6000K peaks in the optical, why aren't fluorescent bulbs at 6000K?
We know, via Wien's law, that a body at 6000 K emits electromagnetic radiation whose peak wavelength lies in the visible spectrum.
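As a quick check of that claim, Wien's displacement law gives
$$ \lambda_\text{max} = \frac{b}{T} = \frac{2.898\times10^{-3}\ \mathrm{m\,K}}{6000\ \mathrm{K}} \approx 4.8\times10^{-7}\ \mathrm{m} \approx 480\ \mathrm{nm}, $$
which indeed lies in the visible (blue-green) part of the spectrum.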
How come, say, fluorescent tubes, which also emit EM waves that we can see as visible light, are not at 6000 K?
Answer
A black body, by definition, produces thermal radiation only, which is EM radiation caused by heat. For such radiation, the temperature of a body defines its radiation spectrum and its peak.
The EM radiation in a fluorescent tube is not due to heat, but due to fluorescence, which is a type of luminescence, defined as emission of light caused not by heat but by other processes.
More specifically, in a fluorescent tube, UV photons are emitted by mercury vapor atoms, excited by fast moving charge carriers (sort of electroluminescence), and then visible light photons are emitted by phosphor coating atoms, excited by UV photons (fluorescence). Both steps here are forms of luminescence, not thermal radiation.
Since fluorescent light is not due to thermal radiation, its temperature is not governed by black body radiation curves. Therefore, even though most of the EM radiation emitted by a fluorescent tube is in the visible light spectrum, its temperature is very low.
Tuesday, 29 May 2018
quantum field theory - Normal Order of Normal Order
In the first volume of Polchinski, page 39, we can read a compact formula to perform normal ordering for bosonic fields $$ :\cal F:=\underbrace{\exp\left\{\frac{α'}{4}∫\mathrm{d}^2z\,\mathrm{d}^2w\,\log|z-w|^2\frac{δ}{δφ(z,\bar z)}\frac{δ}{δφ(w,\bar w)} \right\}}_{:=\mathcal{O}}\cal F, \tag{1} $$
What I do not understand is that I would like to have (bearing in mind the definition involving $a$ and $a^†$) $$ ::\cal F::=:\cal F:\tag{2} $$ but with this formula $$ \cal O^2\cal F≠\cal O \cal F.\tag{3} $$
EXAMPLE
$$ :φ(z)φ(w):=φ(z)φ(w)-\frac{α'}{2}\log|z-w|^2\tag{4} $$ but $$ ::φ(z)φ(w)::=:φ(z)φ(w):-\frac{α'}{2}\log|z-w|^2=φ(z)φ(w)-α'\log|z-w|^2\tag{5} $$
Answer
Short explanation: Polchinski's eq. (1) is not a formula that turns an unordered expression into a normal-ordered one: the expression ${\cal F}$ on the right-hand side of eq. (1) is implicitly assumed to be radially ordered. In fact, eq. (1) is a Wick theorem for changing radial order into normal order, cf. e.g. this Phys.SE post.
Longer explanation: When dealing with non-commutative operators, say $\hat{X}$ and $\hat{P}$, the "function of operators" $f(\hat{X},\hat{P})$ does not make sense unless one specifies an operator ordering prescription (such as, e.g., radial ordering, time-ordering, Wick/normal ordering, Weyl/symmetric ordering, etc.). A more rigorous way is to introduce a correspondence map $$\begin{array}{c} \text{Symbols/Functions}\cr\cr \updownarrow\cr\cr\text{Operators}\end{array}\tag{A}$$ (E.g. the correspondence map from Weyl symbols to operators is explained in this Phys.SE post.) To define an operator $\hat{\cal O}$ on operators, one often gives the corresponding operator ${\cal O}$ on symbols/functions, i.e., $$ \begin{array}{ccc} \text{Normal-Ordered Symbols/Functions}&\stackrel{\cal O}{\longrightarrow} & \text{Radial-Ordered Symbols/Functions} \cr\cr \updownarrow &&\updownarrow\cr\cr \text{Normal-Ordered Operators}&\stackrel{\hat{\cal O}}{\longrightarrow} & \text{Radial-Ordered Operators}\end{array}\tag{B}$$ E.g. Polchinski's differential operator ${\cal O}$ strictly speaking only makes sense if it acts on symbols/functions. The identification (A) of symbols and operators is implicitly implied in Polchinski.
Concerning idempotency of normal ordering, see also e.g. this related Phys.SE post.
newtonian gravity - Gravitational field intensity inside a hollow sphere
It is quite easy to derive the gravitational field intensity at a point within a hollow sphere. However, the result is quite surprising. The field intensity at any point within a hollow sphere is zero.
What exactly is the reason behind this, apart from the mathematics? Is there any intuition for why the field intensity should be zero everywhere within the sphere? For example, it is logical that the field intensity should be zero at the center, as all the contributions cancel out; but it is not obvious that the same holds for any other point within the sphere.
Answer
One intuitive way I've seen to think about the math is that if you are at any position inside the hollow spherical shell, you can imagine two cones whose tips are at your position and which both lie along the same axis, widening in opposite directions. Imagine, too, that they both subtend the same solid angle, but that the solid angle is chosen to be infinitesimal. Then you can consider the little chunks of matter where each cone intersects the shell, as in the diagram on this page.
You still need to do a bit of geometric math, but you can show that the area of each red bit is proportional to the square of the distance from you (the blue point) to it--and hence the mass of each bit is also proportional to the square of the distance, since we assume the shell has uniform density. But gravity obeys an inverse-square law, so each of those two bits should exert the same gravitational pull on you, but in opposite directions, meaning the two bits exert zero net force on you. And you can vary the axis along which the two cones are drawn so that every point on the surface of the shell ends up being part of a pair like this, which leads to the conclusion that the entire spherical shell exerts zero net force on you.
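To make the cancellation explicit: if both cones subtend the same infinitesimal solid angle $\mathrm{d}\Omega$ and the shell has uniform surface density $\sigma$, the patch cut out at distance $r$ has area $\mathrm{d}A \approx r^2\,\mathrm{d}\Omega/\cos\alpha$, where $\alpha$ is the angle between the cone's axis and the surface normal (the same at both ends of the chord, by symmetry). The pull of each patch on a test mass $m$ is therefore
$$ \mathrm{d}F = \frac{G m\,\sigma\,\mathrm{d}A}{r^2} = \frac{G m\,\sigma\,\mathrm{d}\Omega}{\cos\alpha}, $$
independent of $r$, so the two opposite patches attract equally and cancel.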
electromagnetism - Why is TV remote's infrared wave shown by a camera as purple light, why not red or something else?
I've always noticed that the old-model remotes used with DVD players or TVs emit electromagnetic waves with a wavelength somewhere in the infrared region. But when my phone's average-quality camera (Moto E) is held between the emitter and my eyes, it converts the infrared into visible light and shows a whitish purple colour. I don't know why it is always purple that is shown. I'll be very happy if you explain this to me. Thank you.
newtonian mechanics - How can kinetic energy be proportional to the square of velocity, when velocity is relative?
Let's start with kinetic energy (from los Wikipedias)
The kinetic energy of an object is the energy which it possesses due to its motion. It is defined as the work needed to accelerate a body of a given mass from rest to its stated velocity. Having gained this energy during its acceleration, the body maintains this kinetic energy unless its speed changes. The same amount of work is done by the body in decelerating from its current speed to a state of rest. In classical mechanics, the kinetic energy of a non-rotating object of mass m traveling at a speed $v$ is $\frac{1}{2}mv^2$.
Let's say you & your bike have a mass of 100 kg; then your kinetic energy at 10 m/s would be
$$ E_a = 1/2 \times 100 \times 10^2 = 5000J = 5kJ$$
If you apply another 5kJ of energy, you don't get to 20m/s though, you only get to:
$$ E_b = 10000J =1/2 \times 100 \times V_b^2$$ $$\implies V_b = \sqrt{10000 / (1/2 \times 100)} = √200 = 14.14m/s$$
Let's say you and a buddy are both coasting along at 10 m/s, though. From their perspective you've just burned 5 kJ but only gained 4.1 m/s, even though you started from rest relative to them.
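A quick numeric check of those figures (a minimal sketch):

m = 100.0                                  # kg, you plus the bike

def kinetic_energy(v):
    return 0.5 * m * v**2                  # 1/2 m v^2, in joules

E_a = kinetic_energy(10.0)                 # 5000 J at 10 m/s
v_b = (2.0 * (E_a + 5000.0) / m) ** 0.5    # speed after adding another 5 kJ
print(E_a, v_b, v_b - 10.0)                # 5000.0, 14.142..., 4.142...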
Imagine you and your mate are in space drifting along together, at an unknown speed. Your mate fires his burners and accelerates away from you. There's a big screen on his ship showing how many joules of energy he just burned, and you can measure his resulting relative velocity just fine.
The question is: will 5 kJ of energy always produce 10 m/s of relative velocity, assuming 100 kg spaceships?
If 5 kJ always produces 10 m/s, why does the second 5 kJ only produce 4.1 m/s? What is going on here?
homework and exercises - The position of a particle at any time $t$ is given by $S = \frac{V_0}{a}\left[1-e^{-at}\right]$. What are the dimensions of $a$ and $V_0$?
To find the dimensions of $a$ and $V_0$, I must know the dimensions of $S$ and $e$. So I want to know them.
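For what it's worth, a short dimensional sketch of the reasoning involved: the argument of an exponential must be dimensionless, so
$$ [a\,t]=1 \;\Rightarrow\; [a]=\mathrm{T}^{-1}, \qquad [S]=\left[\frac{V_0}{a}\right]=\mathrm{L} \;\Rightarrow\; [V_0]=[S]\,[a]=\mathrm{L\,T^{-1}}, $$
while $e$ itself is a pure number and carries no dimensions.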
Monday, 28 May 2018
special relativity - Why is $c$ $3\times10^8$ times faster than a $1\ \mathrm{m/s}$ car?
The value of a dimensional constant like $c$ is often regarded as unimportant since it can be arbitrarily changed to any desired value by changing our units. For example, $c=3\times10^8$ in $\mathrm{m/s}$, and $c=1$ in $\mathrm{ft/ns}$. The value of a dimensionless constant on the other hand, is significant since it's independent of our metric system (e.g. the ratio between the mass of a proton to that of an electron). My question is not the typical "Why is $c=3\times10^8\ \mathrm{m/s}$?" one, but the dimensionless version of it.
Say you're working in SI and you measure a certain car's speed with respect to the ground of the Earth to be $v=1\ \mathrm{m/s}$; then $c/v=3\times10^8$. This ratio between $c$ and the speed of this particular car is dimensionless, and therefore independent of our metric system. Now someone might argue that this ratio is not important since we can change it to any value we desire by changing our frame of reference. Indeed you can change the value of the ratio by changing the frame of reference; however, the particular value $c/v=3\times10^8$ remains the same (independent of any metric system) if we stick to one frame (with respect to the ground of the Earth).
In this sense I ask: why is $c$ (the upper limit to the speed of any physical object) $3\times10^8$ times faster than that car (confining ourselves to one frame of reference)? Why is $c$ not some other $x$ times faster or slower? Is there some fundamental reason behind this, or is it just an empirical fact, so that it's possible that we could have infinitely many universes just like ours but with a different ratio ($c/v$)?
[Edit]: to put it another way, the crux of my inquiry is this: it's quite conceivable that we could have been living in another universe that is identical to ours and has some $c$ as its upper limit, but the value of $c_\text{our universe}$ is different from $c_\text{another universe}$ (in the same metric system); so why do we live in a universe with $c_\text{our universe}$ and not another value?
Answer
If you ask why $c/v$ is equal to 299,792,458 (this is the right value!), the question obviously depends both on $c$ and $v$. We know that the speed of light is 299,792,458 m/s because that follows from the modern definition of one meter adopted 30+ years ago: one meter is exactly so long that the speed of light in the vacuum is 299,792,458 m/s. This random numerical value was fixed to the integer value approximately measured by 1980 when the error margin was 1.2 m/s. Previously, the numerical value came from the definitions of one second through the solar day, and one meter defined as a fraction of the circumference of a meridian (40,000 km).
One second is defined through a periodic process used in some atomic clocks.
However, your question also depends on $v$, the speed of "the" car, and it's the much less well-defined aspect of your question. Because $c$ is such a robust, well-known, universal constant of physics, your question is basically equivalent to the question "Why you chose the car and its $v$ to be 1 m/s?" This is a particularly good question because if a company produced cars whose speed is just 1 m/s, it wouldn't sell any (except if they were toys for children).
There exists no "canonical car" in the laws of physics whose speed would be $v=1$ m/s and to pretend that this value of the speed has a canonical meaning is equivalent, as dmckee pointed out, to pretending that the dimensionful quantity $c$ is dimensionless. Well, $c$ is not dimensionless, and the ludicrous implicit claims that all cars should have $v=1$ m/s is just another way to see the big difference between dimensionful and dimensionless quantities. 1 m/s is a dimensional quantity, a nontrivial unit, and calling it "the speed of the car" doesn't make it any more dimensionless.
One may ask a similar question. Let's fix the description of 1 m/s as the speed of a walking child instead of a "car", to be more realistic. So the question is why the speed of walking children is about $c/(3\times 10^8)$? Again, the answer depends on the definition of "walking children". What exactly "children" are depends on all the details of life on Earth and the whole evolution of species and the definition of humans, a particular species, as one of the branches of evolution. Different species' infants have significantly different speeds.
So the corrected question is a messy interdisciplinary question in biology, astrophysics, geology, and other fields. There exists absolutely no reason why there should exist a simple answer or even an "exact calculation" of the ratio – obviously, different children have different speeds, anyway. Physics may say that the ratio $c/v$ must be greater than one (special relativity). And it must be substantially greater than one if the walking children don't ionize each other when they hit other children. The speed of light is about 137 times greater than the speed of the electron in the hydrogen atom – a more interesting ratio. It is no coincidence that 137 is approximately the inverse of the fine-structure constant, $1/\alpha$.
But one must understand that the speed of light $c$ is the normal, canonical, universal speed among the two, while the speed of the walking child or the non-existent superslow hypothetical car $v$ is the contrived part of the ratio, one that doesn't really belong to physics, at least not fundamental physics.
measurements - How do we measure mass?
How do you measure mass? Weight is easy using a scale, but we can't measure mass that way, because then mass would be different on every planet. I know there was a Veritasium video (here) on defining what, exactly, one kilogram was, but they can only define that if they know some previous measurement (i.e., one cube of metal is 2kg)!
Sunday, 27 May 2018
optics - Huygens-Fresnel principle
Huygens principle states that
1) Every point on a given wavefront may be considered as a source of secondary wavelets which spread out with the speed of light in that medium. 2) The new wavefront is the forward envelope of the secondary wavelets at that instant.
What does this actually mean, both theoretically and practically? Can it be explained this way: taking the Sun as a source, as the light from the Sun reaches every object, does each and every object actually act as a new source of that light? Or, a ray of light inside a room causes its walls to reflect the light; do the walls then act as a new source?
Explanation and examples related to both principles would be helpful...
Answer
One thing that the Huygens principle means, and I think it is this that you're getting at, is that the propagation of a wavefront is independent of where the light actually came from.
So yes, when a ray of light inside a room reflects off the walls, the walls act as a new source. If the original light source didn't exist, but the walls emitted a wavefront with exactly the same amplitude and phase as before, then you wouldn't be able to tell the difference.
Can the technology behind particle accelerators be used for energy storage?
As I understand it, the kinetic energy of the proton beam in a hadron collider is quite large. Can you build an energy storage system that is based on accelerating a proton beam to relativistic speeds and retrieving the energy later?
Here is my related question: Can the technology behind particle accelerators be used for space propulsion?
gravity - Earth's gravitational waves? Measurable?
Why don't we measure the Earth's own gravitational waves, which (since its gravity is strong enough to hold us on the planet) sound to me as if they might be stronger, instead of measuring a wave created by the mass of a far-away black hole that changes spacetime by 1/1000th of the size of a proton?
cosmology - According to Hubble's Law, how can the expansion of the Universe be accelerating?
Scientists today think the expansion of the universe is accelerating.
According to Hubble's law, objects further away are moving faster than objects closer to us. The further away an object is, the further back in time we are seeing, so in the past objects moved faster (is this sentence correct?).
So because objects moved faster before than they do now, surely that's deceleration not acceleration?
my textbook says "if Scientists today think the rate of expansion of the Universe were decreasing then distant objects should appear different to Hubble's Law predictions: universe is accelerating. More distant objects would seem to be receding faster (since expansion was faster in the past)"
potential - What maintains constant voltage in a battery?
I know there's lots of questions that address similar situations, (Batteries connected in Parallel, Batteries and fields?, Naive Question About Batteries, and the oft-viewed I don't understand what we really mean by voltage drop).
However, I have a question, and after examining just the battery structure, I have been wondering, exactly what structure/process maintains the constant voltage drop within batteries? I mean, certain chemical reactions are occurring in each half-cell, and electrolytes maintain charge conservation, I get that there's some motivation for the electron to move from one cell to the other.
But why is this motivation so constant? I really want to get this.
Answer
Consider for a moment, a cell that is not connected to a circuit, i.e., there is no path for current external to the cell.
The chemical reactions inside the cell remove electrons from the cathode and add electrons to the anode.
Thus, as the chemical reactions proceed, an electric field builds between the anode and cathode due to the differing charge densities.
It turns out that this electric field acts to reduce the rate of the chemical reactions within the cell.
At some point, the electric field is strong enough to effectively stop the chemical reactions within the cell.
The voltage across the terminals of the cell, due to this electric field, is then constant and this is the open-circuit voltage of the cell.
If an external circuit is connected to the cell, electrons flow from the anode through the external circuit and into the cathode, reducing the difference in charge densities which in turn reduces the electric field just enough such that the chemical reactions can once again take place to maintain the electric current through the circuit.
The larger the external current, the greater the required rate of chemical reactions and thus, the lower the voltage across the terminals.
As long as the circuit current is significantly less than the maximum current the chemical reactions can sustain, the voltage across the battery terminals will be close to the open-circuit voltage.
As the external current approaches the maximum current, the voltage across the terminals rapidly falls and when the voltage is zero, the cell is supplying maximum current. This current is called the short-circuit current.
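This behaviour is summarised by the usual internal-resistance model: for a cell of EMF $\varepsilon$ and internal resistance $r$ delivering current $I$, the terminal voltage is
$$ V = \varepsilon - I\,r, $$
so $V \approx \varepsilon$ for small currents and $V \to 0$ as $I$ approaches the short-circuit current $I_\text{sc} = \varepsilon/r$.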
Saturday, 26 May 2018
thermodynamics - How large would the steam explosion at Chernobyl have been?
So the second episode of the HBO series began to cover the risk of a steam explosion that led to them sending three divers into the water below the reactor to drain the tanks.
This occurred after the initial explosion that destroyed the reactor, and after the fire in the core had been put out. But at this point the decay heat and remaining fission reaction kept the core at more than 1200°C, causing it to melt through the concrete floors below the reactor.
And below the reactor were water tanks which contained 7,000 cubic meters of water (according to the TV show; if anyone has a real figure, I'd love to hear it). When the lava of the melted core hit it, it would cause an enormous steam explosion.
Finally, my question: About how large would this explosion have been? The character in the show says "2-4 megatons" (of TNT equivalent, I assume). I'm pretty sure this is absurd and impossible. But real estimates are hard to come by. Other sources vary wildly, some repeating the "megatons" idea, and others saying it would've "level[ed] 200 square kilometers". This still seems crazy.
tl;dr:
I know a lot of it hinges on unknowns and the dynamics of the structures and materials involved, so I can simplify it to a constrained physics question:
Assuming 7,000 cubic meters of water instantly flashes to steam, how much potential energy is momentarily stored in that volume of steam occupying the same volume as the water did?
I don't know what to assume the temperature of the steam is. There were hundreds of tons of core material at temperatures near 1200°C, so worst case scenario you could assume all the steam becomes that temperature as the materials mix. Best case scenario, I guess we could assume normal atmospheric boiling point (100°C)?
Answer
In my view the water isn't really the thing to focus on here. The real energy reservoir was the partially-melted core; the water wasn't dangerous because it held energy, but rather because it had the potential to act as a heat engine and convert the thermal energy in the core into work. We can therefore calculate the maximum work which could conceivably be extracted from the hot core (using exergy) and use this as an upper bound on the amount of energy that could be released in a steam explosion. The exergy calculation will tell us how much energy an ideal (reversible) process could extract from the core, and we know from the Second Law of Thermodynamics that any real process (such as the steam explosion) must extract less.
Calculation
Using exergy, the upper bound on the amount of work which could be extracted from the hot core is
\begin{align} W_\text{max,out} &= X_1 - X_2 \\ &= m(u_1 - u_2 -T_0(s_1-s_2)+P_0(v_1-v_2)) \end{align} If we assume that the core material is an incompressible solid with essentially constant density, then \begin{align} W_\text{max,out} &= m(c (T_1 - T_2) -T_0 c \ln(T_1/T_2)) \end{align} where $T_0$ is the temperature of the surroundings, $T_2$ is the temperature after energy extraction is complete, and $T_1$ is the initial temperature. At this point you just need to choose reasonable values for the key parameters, which is not necessarily easy. I used:
- $T_1 = 2800\,^\circ\text{C}$ based on properties of corium
- $T_2 = T_0$ as an upper bound (the most energy is extracted when the system comes to the temperature of the surroundings)
- $T_0 = 25\,^\circ\text{C}$ based on SATP
- $c = 300\,\text{J/(kg.K)}$ based on properties of UO$_2$
- $m = 1000\,\text{tonnes}$ based on the text in your question.
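Plugging these values into the expression above (a quick numerical check; the only extra assumption is the usual 1 tonne TNT = $4.184\times10^9$ J conversion):

import numpy as np

m  = 1.0e6                 # kg (1000 tonnes of core material)
c  = 300.0                 # J/(kg K), rough specific heat of UO2
T1 = 2800.0 + 273.15       # K, initial core temperature
T2 = 25.0 + 273.15         # K, final temperature = T0 (surroundings at SATP)
T0 = T2

W_max = m * (c * (T1 - T2) - T0 * c * np.log(T1 / T2))
print(W_max)               # ~6.2e11 J
print(W_max / 4.184e9)     # ~149 tonnes of TNT equivalent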
This gives me $W_\text{max,out} = 6.23 \times 10^{11}\,\text{J}$ or 149 tonnes of TNT equivalent. This is several orders of magnitude lower than the "megatons" estimate provided in your question, but does agree with your gut response that "megatons" seems unreasonably high. A sanity check is useful to confirm that my result is reasonable...
Sanity Check
With the numbers I used, the system weighs 1 kiloton and its energy is purely thermal. If we considered instead 1 kiloton of TNT at SATP, the energy stored in the system would be purely chemical. Chemical energy reservoirs are generally more energy-dense than thermal energy reservoirs, so we'd expect the kiloton of TNT to hold far more energy than the kiloton of hot core material. This suggests that the kiloton of hot core material should hold far less than 1 kiloton of TNT equivalent, which agrees with your intuition and my calculation.
Limitations
One factor which could increase the maximum available work would be the fact that the core was partially melted. My calculation neglected any change in internal energy or entropy associated with the core solidifying as it was brought down to ambient conditions; in reality the phase change would increase the maximum available work. The other source of uncertainty in my answer is the mass of the core; this could probably be deduced much more precisely from technical documents. A final factor that I did not consider is chemical reactions: if the interaction of corium, water, and fresh air (brought in by an initial physical steam explosion) could trigger spontaneous chemical reactions, then the energy available could be significantly higher.
Conclusion
Although addressing the limitations above would likely change the final upper bound, I doubt that doing so could change the bound by the factor of ten thousand required to give a maximum available work in the megaton range. It is also important to remember that, even if accounting for these factors increased the upper bound by a few orders of magnitude, this calculation still gives only an upper bound on the explosive work; the real energy extracted in a steam explosion would likely be much lower. I am therefore fairly confident that the megaton energy estimate is absurd, as your intuition suggested.
How does a quantum field transform in the case of a particular spin?
Except when a particle is spin-0, the field of a particle transforms when the frame of reference is changed, and this is what defines its spin. The question is: specifically, how does the quantum field transform in the corresponding equation, and how does this relate to the value of the spin? I want to see how this works mathematically (specifically for spin-1/2 and spin-1, obviously).
Friday, 25 May 2018
classical mechanics - Blowing your own sail?
How is this possible? Even if the gif is fake, the Mythbusters did it, and with a large sail it really does move forward. What is the explanation?
homework and exercises - Distribution of normal force on a book resting on the edge of a table
Assume a uniform book is resting on a table. The book has length $1$. But not the entire book is resting on the table, a bit $\alpha$ is sticking out of the table.
Obviously, due to experience, we get that if $0\leq\alpha\leq 1/2$, then the book will just remain at rest. If $\alpha>1/2$, then the book will experience a nonzero torque and eventually fall.
But how do we obtain this same answer with physics? Here's a first attempt that apparently does not work out. Assume for simplicity that $\alpha=1/4$, then $3/4$ is resting on the table.
The book is experiencing a normal force and a weight force. To find the weight force, it is sufficient to take a weight force at the center of the book of magnitude $mg$. The contribution to the torque is then $mg(1/4)$ (since the center of mass is $(1/4)$ away from the rim of the table). But how to find the torque coming from the normal force? By Newton's law, the magnitude of the normal force equals the magnitude of the weight, namely $mg$. Now I try to split up the book into infinitesimal components and integrate. I obtain $$\int_0^{3/4} mgx\,dx = \frac{9mg}{32}$$ This does not at all equal the contribution of the torque, $mg/4$. So there will be a nonzero torque, which can't happen.
So what is going wrong? Is the normal force somehow not uniform over the surface of the table? If it is nonuniform, how is it distributed, and how do I solve this problem as above using integration?
By the way, just for information, I do know how to solve this problem the following way: if the book is somehow a lever resting on just the rim (and not the table), then we can solve this problem very easily using the law of the lever and such. When we do this, we find eventually that if $0\leq\alpha\leq 1/2$, then the torque must be "negative". The constraint force of the table prohibits this and just sets the torque to $0$. I'm sorry if this doesn't make much sense, but it isn't my main question anyway. My question is why the naive integration answer does not work and whether the normal force is somehow not uniform.
quantum mechanics - Why do populations only change in second order of the driving field?
In the field of quantum optics, when solving master equations it is well known that the populations1 are constant to linear order in the driving field, i.e. weak driving fields will only affect the coherences of the system.
I am interested in how this statement can be made more precise and what the most general system is that it applies to. So far I could only find derivations for example systems (e.g. a two-level system) and imprecise statements of this notion.
To be a bit more formal: Let there be a general multi-level system with unspecified couplings between the levels and a loss channel from each level. The system is driven by a harmonic field of given frequency and intensity and couples to (some of the) levels via a dipole approximation Hamiltonian. For such a system, under what conditions do the populations1 only change in second order of the driving field and how can this be proven in general?
As an example, a typical master equation for a two-level system is
$$\dot{\rho} = \frac{1}{i \hbar} \left[H,\rho\right] - \frac{\gamma}{2} \left(\sigma^+ \sigma^- \rho + \rho \sigma^+ \sigma^- - 2 \sigma^- \rho \sigma^+ \right)$$
where $\sigma^\pm$ are the raising and lowering operators for the two level system, $\rho$ is the system density matrix in the interaction picture and $H=\hbar \Omega (\sigma^+ + \sigma^-)$ is the laser driving Hamiltonian. $\Omega$ is proportional to the driving field and the dipole matrix element.
If one solves this system (e.g. most simply finds the steady state) one will see that the populations1 are not affected by the driving in linear order of $\Omega$. Note that the question is not about an example, but about formulating this notion in its most general form and how to prove it.
1 The diagonal elements of the density matrix.
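For concreteness, here is a minimal numerical sketch of the two-level example above (assuming the QuTiP library, whose sigmap/sigmam convention effectively puts the excited state at index 0 here; the parameter values are arbitrary, and the collapse operator matches the dissipator in the equation above). Starting from a diagonal state, the population transferred after a short fixed time scales roughly as $\Omega^2$:

import numpy as np
from qutip import basis, sigmap, sigmam, mesolve

gamma = 1.0
tlist = np.linspace(0.0, 0.1, 51)
rho0 = basis(2, 1) * basis(2, 1).dag()       # diagonal initial state: all population in the lower level

for Omega in [0.01, 0.02, 0.04]:
    H = Omega * (sigmap() + sigmam())        # driving Hamiltonian (hbar = 1)
    result = mesolve(H, rho0, tlist,
                     c_ops=[np.sqrt(gamma) * sigmam()],        # decay channel
                     e_ops=[basis(2, 0) * basis(2, 0).dag()])  # excited-state population
    print(Omega, result.expect[0][-1])       # doubling Omega roughly quadruples the transfer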
Answer
I will consider a general multi-level system. Like in your example, let us write $\rho$ for the density matrix of the system in the interaction picture. That means that the populations of the system are* the diagonal entries of $\rho$, $$ p_n(t) = \langle n | \rho(t) | n \rangle . $$
Generically, the system Hamiltonian (in the interaction picture) can be split into two parts: $$ H = H^0 + K . $$ $H^0$ is the diagonal part. For example, for the two-level system it is the Lamb/Stark shift proportional to $\sigma^z$, which was neglected in your example. On the other hand, $K$ is the off-diagonal part, i.e. the driving (in your example, the $\sigma^+$ and $\sigma^-$ terms).
Remark: $H$ is often time-dependent in practice. Time-dependence would not change the following much, and I want to keep the notation simple.
The time evolution of the density matrix is then $$ \dot \rho(t) = -\frac{\mathrm i}{\hbar} [H^0 + K, \rho(t)] + \hat D \rho(t) , $$ where $\hat D$ is the dissipative part, which is not terribly important for this question. Using this, we can calculate how the populations evolve in time: $$ \dot p_n(t) = -\frac{\mathrm i}{\hbar} \langle n | [K, \rho(t)] | n \rangle + \langle n | \hat D \rho(t) | n \rangle . \tag{1} $$ Note that the $H^0$ term has dropped out, because $\langle n | [H^0, \rho(t)] | n \rangle = E^0_n\, p_n(t) - p_n(t)\, E^0_n = 0$, where $E^0_n$ are the eigenvalues of $H^0$.
Assume now that the density matrix is diagonal initially: $$ \rho(0) = \sum_n p_n(0)\, |n \rangle\!\langle n| .$$ In that case, $$ \dot p_n(0) = \langle n | \hat D \rho(0) | n \rangle $$ does not depend on $K$. The first term is zero, because $\rho(0) | n \rangle = p_n(0) | n \rangle$ and we can use the same trick that we used earlier for $\langle n | [H^0, \rho(t)] | n \rangle$.
Dissipation does its work immediately, but the driving will only change the populations after a short while, when $\rho(\Delta t)$ is not diagonal any more: $$ \rho(\Delta t) = \rho(0) + \left( -\frac{\mathrm i}{\hbar} [K, \rho(0)] + \hat D\rho(0) \right) \Delta t $$ so that $$ \dot p_n(\Delta t) = -\frac{1}{\hbar^2} \langle n | [K, [K, \rho(0)]] | n \rangle \Delta t + \cdots . $$
You see that the effect is quite general: the driving field only changes the populations at second order. And it is easy to understand why: (1) tells us that the change of the populations due to $K$ is proportional to the coherences. If the system is initially diagonal, the driving has to create these coherences first.
If the system is not initially diagonal, the whole statement is not true at all. Consider your example without the dissipative part, that can be solved easily. For $\rho(0) = \scriptstyle\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$, you get $$ \rho(t) = \frac 1 2 \begin{pmatrix} 1 - \cos(2\Omega t) & -\mathrm i \sin(2\Omega t) \\ \mathrm i \sin(2\Omega t) & 1 + \cos(2\Omega t) \end{pmatrix} , $$ the populations change in second order.
After a $\pi/2$-pulse, the system is in the state $\scriptstyle \frac 1 2 \begin{pmatrix} 1 & -\mathrm i \\ \mathrm i & 1 \end{pmatrix}$. Let us for example see what happens if we start in that state, $\rho(0) = \scriptstyle \frac 1 2 \begin{pmatrix} 1 & -\mathrm i \\ \mathrm i & 1 \end{pmatrix}$: $$ \rho(t) = \frac 1 2 \begin{pmatrix} 1+\sin(2\Omega t) & -\mathrm i \cos(2\Omega t) \\ \mathrm i \cos(2\Omega t) & 1-\sin(2\Omega t) \end{pmatrix} . $$ The populations change in the first order.
*Side note: This is only true in the interaction picture. Without the interaction picture, the populations are $p_n(t) = \langle E_n(t) | \rho(t) | E_n(t) \rangle$, where $|E_n(t)\rangle$ are the (in general time-dependent) eigenvectors of the system Hamiltonian.
Why does a black hole have a finite mass?
I mean besides the obvious "it has to have finite mass or it would suck up the universe." A singularity is a dimensionless point in space with infinite density, if I'm not mistaken. If something is infinitely dense, must it not also be infinitely massive? How does a black hole grow if everything that falls into it merges into the same singularity, which is already infinitely dense?
Answer
If something is infinitely dense, must it not also be infinitely massive?
Nope. The singularity is a point where volume goes to zero, not where mass goes to infinity.
It is a point with zero volume, but which still holds mass, due to the extreme stretching of space by gravity. The density is $\frac{mass}{volume}$, so we say that in the limit $volume\rightarrow 0$, the density goes to infinity, but that doesn't mean mass goes to infinity.
The reason that it is the volume that goes to zero, rather than the mass becoming infinite, is easy to see in an intuitive sense from the creation of a black hole. You might think of a volume of space with some mass which is compressed due to gravity. Normal matter stops being compressible at a certain point due to Coulomb repulsion between atoms, but if the gravity is strong enough, you might get past that. You can keep compressing it (though you'll probably have to overcome some other force barriers along the way) until it has zero volume. But it still contains mass! The mass can't just disappear through this process. The density is infinite, but the mass is still finite.
general relativity - How were the solar masses and distance of the GW150914 merger event calculated from the signal?
The GW150914 signal was observed, giving us the frequency and amplitude of the event. Because LIGO has two detectors a rough source location could be derived.
But how do these three factors allow the mass of the black holes and their distance to be calculated? If the wave strengths are in a square relationship to distance, then couldn't there be an infinite number of other masses and distances which would give the same signals?
electromagnetism - Can a free particle absorb/emit photons?
As simple as in the title. I would also like to see some of the mathematics involved!
Answer
It cannot. This is because energy and momentum cannot both be conserved if a free charged particle (say, an electron) emits a photon. It needs interaction with at least a second charged particle in order to do so (as in Bremsstrahlung). The mathematics involved is that of the energy of a photon $E=\hbar \omega$, the energy of a particle $E^2 = m^2 c^4 + p^2 c^2$, the momentum of a photon $p = \hbar \omega /c$, plus simple trigonometry and basic algebra, very much as in the classical version of Compton scattering.
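A quick way to see the kinematic obstruction: work in the rest frame of the free electron and suppose it emits a photon of momentum $p>0$. Momentum conservation forces the electron to recoil with momentum $-p$, and energy conservation would then require
$$ m c^2 = p c + \sqrt{m^2 c^4 + p^2 c^2} \;\ge\; p c + m c^2, $$
which is only possible for $p=0$, i.e. no photon at all. The same argument run in reverse rules out absorption of a photon by a free particle.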
quantum mechanics - Pegg-Barnett phase implementation does not seem to work
I attempt to monitor the phase of a wavevector $|\psi\rangle$.
As I found (e.g. here ), a matrix representation for the Pegg-Barnett phase operator in Fock base can be obtained as
$$\Phi=\sum_{m,n,n'=0}^s\frac{2\pi m}{(s+1)^2}\,\exp\left[\frac{2\pi i m(n-n')}{s+1}\right]|n\rangle\langle n'|$$
for a particle number cutoff $s$. Making the indices start at one instead of zero (for implementation in matlab), this becomes
$$\Phi=\sum_{\tilde{n},\tilde{n'}=1}^\tilde{s}\sum_{m=0}^{\tilde{s}-1}\frac{2\pi m}{\tilde{s}^2}\,\exp\left[\frac{2\pi i m(\tilde{n}-\tilde{n'})}{\tilde{s}}\right]|\tilde{n}\rangle\langle \tilde{n'}|$$ such that $|\widetilde{n+1}\rangle=|n\rangle$.
When I construct the wavefunction, though, of e.g. a coherent state with $\alpha=\sqrt{30}$, $\langle\alpha|\Phi|\alpha\rangle$ with $\tilde{s}=200$ gives me a phase of ~2.7, whereas this coherent state is located in phase space on the positive X-axis (so that a phase of zero would be expected).
What went wrong?
NOTE ADDED: by trial and error, I found that we can obtain a phase prediction of zero if the summation over $m$ also goes up to $\tilde{s}$ and, in addition, the reference phase $\theta_0$ is chosen to be $-\pi$ instead of zero. I don't understand why, though, or whether this is to be trusted as a measure of phase. At first sight, it does not seem to fulfil the small-angle approximation $\phi\approx \frac{P}{X}$. Any insight is still welcome.
CODE USED with the original parameters
%% Constructing the initial state
alpha=sqrt(30);Nmax=200;
a=sparse(Nmax,Nmax);for it=1:Nmax-1, a(it,it+1)=sqrt(it); end;adag=a';
%annihilation and creation operators for use in the displacement operator.
vacuumpsi=zeros(Nmax,1); vacuumpsi(1,:)=1;
largepsiinit=exp(-0.5*abs(alpha)^2)*(expm(alpha*adag)*vacuumpsi);
%% The PB operator (tildes omitted)
s=Nmax;
PBphaseop=zeros(s);
thetazero=0;
for n=1:s
for nprime=1:s
for m=0:s-1
PBphaseop(n,nprime)=PBphaseop(n,nprime)+(-thetazero+2*pi*m/(s))/Nmax*exp(1i*(n-nprime)*2*pi*m/(s));
end
end
end
%% Calculate expectation value of phase
phaseexp=largepsiinit'*PBphaseop*largepsiinit
Answer
What works quite nicely is defining the operator from eq.33 in https://arxiv.org/pdf/hep-th/9304036.pdf .
Also, it was to be expected that the PB phase of a coherent state on the positive real axis is not zero for reference phase $\theta_0=0$: the eigenvalue spectrum is bounded by $0$ and $2\pi$, and as a coherent state has a finite phase uncertainty, it crosses the phase cut.
Aside from these issues, the PB operator used here has proven useful in monitoring the phase of a numerical wavevector and gives quite intuitive results. For further improvement, one can also iteratively update the $\theta_0$ used, to avoid crossing the phase cut.
Wednesday, 23 May 2018
thermodynamics - Why do people wear black in the Middle East?
I have read various dubious explanations as to why people often wear black in the heat, from cultural to somehow encouraging the evaporation of sweat (unconvincing). So, does anyone know what, if any benefit there is to black clothing in hot dry conditions? It is certainly counterintuitive.
special relativity - Length contraction, front middle or back
I still don't have a solid understanding of Length contraction. Imagine we have a ruler of length $L$ that starts at rest upon a ground with markings on it, then accelerates until nearly the speed of light.
For a stationary observer, the ruler accelerates at $a$ metres per second squared. Which of the points, then, the red, the blue, or the green, is the one accelerating at $a$? The way I envision it, the faster it gets the more length-contracted it is, so the accelerations at the red, green and blue points must be different, depending on whether the ruler contracts towards the front, the back, or the middle. If the ruler contracts towards the front, the red point is accelerating at $a$, and the green's acceleration must be greater to keep up. If the green point is accelerating at $a$, the red's acceleration must be less for the ruler to contract. Or perhaps there is no problem at all, and it just depends on which point we take as the reference? Generally, would we then take the blue point as the reference when we say a ruler is accelerating at $a$, so that the green point's acceleration is greater than $a$ and the red point's acceleration is less than $a$? This seems an OK solution. But then consider this problem instead: two rulers start side by side and accelerate to very high velocities.
Both rulers have rocket boosters, and therefore must, to a stationary observer, accelerate at the same rate. But when we say this, which acceleration do we mean? Does this mean that, since the rocket boosters are at the back of each ruler, the back of each ruler accelerates at $a$, meaning the front of both rulers must contract towards the back, so that a stationary observer sees a gap open up between the rulers? If this makes sense, does that mean that wherever I place the boosters, there will always be a gap developing between the two rulers? I'm not too sure.
Answer
For a stationary observer, the ruler accelerates at $a$ metres per second squared.
One must be careful here.
If all points of the ruler have the same coordinate acceleration $a$, then the ruler length remains $L$ for the stationary observer.
If, on the other hand, the ruler is observed to contract, different points of the ruler have different coordinate acceleration.
Also, note that if $a$ is constant, the proper acceleration $\alpha$ of the ruler diverges as the speed, according to the stationary observer, approaches $c$.
Finally, note that if the points of the ruler maintain constant proper distance and constant proper acceleration, the points of the ruler are Rindler observers which means that the points 'to the rear' of the ruler must have greater proper acceleration than the points 'to the front' of the ruler.
All of this is to say that acceleration of extended bodies is complicated in SR and one must be careful to distinguish between coordinate and proper acceleration.
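For reference, a Rindler observer held at Rindler coordinate $x$ (a proper distance $x = c^2/\alpha$ from the horizon) has constant proper acceleration
$$ \alpha = \frac{c^2}{x}, $$
so in a rigidly accelerating ruler the rear (smaller $x$) really does have to accelerate harder than the front.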
quantum field theory - Fermions, different species and (anti-)commutation rules
My question is straightforward:
Do fermionic operators associated to different species commute or anticommute? Even if these operators have different quantum numbers? How can one prove this fact in a general QFT?
quantum mechanics - Is there actually a 0 probability of finding an electron in an orbital node?
I have recently read that an orbital node in an atom is a region where there is a 0 chance of finding an electron.
However, I have also read that there is an above-zero chance of finding an electron practically anywhere in space, and that orbitals merely represent areas where there is, for example, a 95% chance of finding the electron.
I would just like to know if there truly is a 0 probability that an electron will be within a region defined by the node.
Many thanks.
Answer
The probability of finding the electron in some volume $V$ is given by:
$$ P = \int_V \psi^*\psi\,dV \tag{1} $$
That is we construct the function called the probability density:
$$ F(\mathbf x, t) = \psi^*\psi $$
and integrate it over our volume $V$, where as the notation suggests the probability density is generally a function of position and sometimes also of time.
There are two ways the probability $P$ can turn out to be zero:
- $F(\mathbf x, t)$ is zero everywhere in the volume $V$ - note that we can't get positive-negative cancellation as $F$ is a square and is everywhere $\ge 0$.
- we take the volume $V$ to zero, i.e. as for the probability of finding the particle at a point.
Now back to your question.
The node is a point or a surface (depending on the type of node) so the volume of the region where $\psi = 0$ is zero. That means in our equation (1) we need to put $V=0$ and we get $P=0$ so the probability of finding the electron at the node is zero. But (and I suspect this is the point of your question) this is a trivial result because if $V=0$ we always end up with $P=0$ and there isn't any special physical significance to our result.
Suppose instead we take some small but non-zero volume $V$ centred around a node. Somewhere in our volume the probability density function will inevitably be non-zero because it's only zero at a point or nodal plane, and that means when we integrate we will always get a non-zero result. So the probability of finding the electron near a node is always greater than zero even if we take near to mean a tiny, tiny distance.
So the statement "the probability of finding the electron at a node is zero" is either vacuous or false, depending on whether you interpret it to mean precisely at a node or approximately at a node.
But I suspect most physicists would regard this as a somewhat silly discussion, because we would generally mean that the probability of finding the electron at a node or nodal surface is negligibly small compared to the probability of finding it elsewhere in the atom.
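To put a number on "vacuous or false", here is a minimal sketch using the simplest one-dimensional analogue of a node, the $n=2$ particle-in-a-box state (chosen purely for illustration, not the atomic orbital itself): the density vanishes exactly at the node, yet any finite interval around it carries a small but strictly positive probability.

import numpy as np
from scipy.integrate import quad

L = 1.0                 # box length (arbitrary units)
delta = 1e-3            # half-width of a small interval around the node at x = L/2

def density(x):         # |psi_2|^2 with psi_2 = sqrt(2/L) sin(2 pi x / L); node at x = L/2
    return (2.0 / L) * np.sin(2.0 * np.pi * x / L) ** 2

print(density(L / 2))                                   # ~0 (zero up to floating-point rounding)
P_near, _ = quad(density, L / 2 - delta, L / 2 + delta)
print(P_near)                                           # ~5e-8: tiny, but not zero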
Tuesday, 22 May 2018
energy conservation - If the universe is expanding, what is it expanding into?
If the universe is expanding, what is it expanding into?
When the big bang happened where did it occur?
When the big bang happened how did it occur?
Where did the energy come from? Energy cannot be created or destroyed; does that mean energy existed before the universe was here?
astrophysics - When do stars become red giants?
I am a bit confused: when do stars become red giants? Is it just after they have finished core H burning, when the core contracts, creating high temperatures which allow core He burning to occur, which creates outward pressure pushing the outer layers apart, or is it something different? Also, do core He burning and shell H burning (by burning I mean fusion) occur at the same time, and if not, which comes first? Or am I totally wrong with all these points?
Answer
Is it just after they have finished core H burning and the core contracts creating high temperatures which result in core He burning...?
It is after the core finishes H burning, but He burning is not required. Hydrogen shell burning is sufficient to make it a red giant. Helium burning would make it a Horizontal Branch Star. See good explanation here: http://abyss.uoregon.edu/~js/ast122/lectures/lec16.html
Also does core He burning and shell H burning (by burning I mean fusion) occur at the same time and if not which comes first??
Shell H burning starts first, but can continue after He burning starts.
quantum mechanics - Why does my thought experiment involving entanglement appear to violate the speed of light?
I have a thought experiment that "seems" to allow messages to be passed faster than the speed of light. Since that is impossible, I'd like to learn what is wrong with my experiment/reasoning. Start with the classic setup where 2 particles (spin up and spin down) are entangled and then sent in opposite directions to locations A and B (possibly even light years apart). Now picture an ensemble of these (each pair is independent of the other pairs). If no observations (measurements) are made at location B, then there must be results of measurements at location A that demonstrate superposition, and these results must look different than if observations were made at location B. So now picture an ensemble sent out every second, with measurements (which won't start until the first ensembles reach A and B) also made every second. Now to send a message from B to A one just needs, for each ensemble arriving at B each second, to either observe the ensemble (call it bit '1') or not observe it (bit '0'). Then, simultaneously, at A, one just needs to do measurements to determine whether each incoming ensemble is in a superposition state or not. If superposition, then bit='0'; if not, then bit='1'. Since the sending and receiving of the messages is simultaneous, there is the violation of the speed of light.
Monday, 21 May 2018
quantum mechanics - What constitutes measuring in the double slit experiment?
In the double slit experiment attempting to measure which slit the particle passed through causes the wave function to collapse.
According to the question: What is the quantum mechanical definition of a measurement?
Until we have an accepted solution of the Measurement Problem there is no definitive definition of quantum measurement, since we don't know exactly what happens at measurement.
And:
The many-worlds interpretation defines measurement as any physical procedure in which the observer gets entangled with a quantum system.
To me, the most obviously arising avenue of investigation would be to narrow down on precisely what does or doesn't cause the wave function to collapse.
Have physicists extensively experimented with what conditions cause it to collapse? Do you have to be taking a measurement to make it collapse or will any interaction cause it to collapse? For example, what if you measure it with an apparatus that then destroys the data gathered without allowing the data to exit a faraday cage, so that it is impossible for any information to ever be accessible to the outside universe? Then what ever the answer, one might invent even more obscure hypothetical circumstance to test...
Has this avenue been explored?
general relativity - Why doesn't $ds^2 = 0$ imply two distinct points $p$ and $p'$ on a manifold are the same point?
Let's suppose I have a spacetime manifold $M$. Let $p$ be a point on my manifold. Now I move from $p$ to some other point $p'$. Presumably I should have moved some "distance" right? How can I speak of notions of space and time if I have no conception of distance?
But now consider light moving through spacetime. Suppose my light starts at $p = (0,0,0,0)$ and travels to $p' = (1,1,0,0)$. By the definition of the spacetime interval $ds^2 = dt^2 - dx^2 - dy^2 - dz^2$, this should mean $ds^2 = (1)^2 - (1)^2 - 0 - 0 = 0$. So $ds^2=0$.
Yet I have moved from point $p$ to $p'$. So I clearly have moved along some path along the curve, but the length of this path is zero. Shouldn't that mean $p$ and $p'$ are the same point?
Note: I think I may be suffering from an overly Euclidean mindset and my brain hasn't adapted yet enough to the non-Euclidean logic of semi-Riemannian manifolds.
My Question
Can someone resolve this contradiction?
Answer
Let's separate out some definitions:
metric(1): Given a set $X$, a function $d : X \times X \to \mathbb{R}$ such that the following axioms hold for all $x,y,z \in X$:
- $d(x,y) \geq 0$,
- $d(x,y) = 0 \Leftrightarrow x = y$,
- $d(x,y) = d(y,x)$, and
- $d(x,z) \leq d(x,y) + d(y,z)$.
pseudo-metric(1): Given a set $X$, a function $d : X \times X \to \mathbb{R}$ such that the following axioms hold for all $x,y,z \in X$:
- $d(x,x) = 0$,
- $d(x,y) = d(y,x)$, and
- $d(x,z) \leq d(x,y) + d(y,z)$.
metric(2): (aka "inner product") Given a vector space $V$ over a field $F$, which is either $\mathbb{R}$ or $\mathbb{C}$, a function $g : V \times V \to F$ such that the following axioms hold for all $x,y,z \in V$ and $a \in F$:
- $g(x,y) = \overline{g(y,x)}$;
- $g(ax,y) = a g(x,y)$,
- $g(x+y,z) = g(x,z) + g(y,z)$,
- $g(x,x) \geq 0$, and
- $g(x,x) = 0 \rightarrow x = 0$.
pseudo-metric(2): (aka "pseudo inner product") Given a vector space $V$ over a field $F$, which is either $\mathbb{R}$ or $\mathbb{C}$, a function $g : V \times V \to F$ such that the following axioms hold for all $x,y,z \in V$ and $a \in F$:
- $g(x,y) = \overline{g(y,x)}$;
- $g(ax,y) = a g(x,y)$,
- $g(x+y,z) = g(x,z) + g(y,z)$, and
- $x \neq 0 \rightarrow \exists\ v \in V : g(x,v) \neq 0$.
Now you want to define a distance between points on a manifold. You are intuitively looking for a (pseudo-)metric(1) here, a distance function on a set without any extra structure. The problem is all you are given is a (pseudo-)metric(2) on the tangent space at each point. Your (pseudo-)metric(2) can only give you magnitudes of tangent vectors at points. Intuitively, these are "infinitesimal distances." You need to integrate such magnitudes along a path in order to get distances between points.
But this is the crux of the issue: What path do you choose? Even for a nice manifold like the surface of a 2-sphere (that is, something with a real metric(2), not just a pseudo-metric(2), on its tangent bundle), the distance between points is path dependent. You could fly directly from New York to London along a great circle (geodesic), or you could stop by in Beijing.
If you have positive-definiteness working for you, you could take the infimum over all paths from one point to another. Consider curves of the form \begin{align} \gamma : [0,1] & \to M \\ \lambda & \mapsto p \\ 0,1 & \mapsto p_1,p_2. \end{align} Then $$ d(p_1,p_2) = \inf_\gamma \int_0^1 \left(g_p \left(\frac{\mathrm{d}p}{\mathrm{d}\lambda}, \frac{\mathrm{d}p}{\mathrm{d}\lambda}\right)\right)^{1/2} \, \mathrm{d}\lambda $$ defines a distance function in the metric(1) sense as long as $g_p$ is an honest metric(2) inner product at each $p$.
Unfortunately, when you try this with a Lorentzian manifold equipped with a pseudo-metric(2), the construction fails to produce anything useful. Even taking an absolute value before the square root, there will always be a piecewise differentiable null path between any two points. Thus there will be differentiable curves of length arbitrarily close to $0$, and so the pseudo-metric(1) you induce is trivial: all distances are $0$.
Sunday, 20 May 2018
momentum - The elusive difference between force and impulse
Impulse is defined as the product of a force $F$ acting for a (short) time $t$, $J = F*t$, and that is very clear. What I find difficult to understand is how a force can exist that doesn't act for a time.
If we consider the most common and observable force: gravity, the force of gravity is defined as $m*g$ and for a body of 1 kg of mass is equivalent to $\approx$ 10 N.
But whenever we consider gravity we must consider the time: if a book falls from the table to the ground (h = 0.8 m) the force acts for a (short) time t ≈ 0.4 s.
- Is there/can there be a force that doesn't act for a time?
- Can you explain why we do not refer to the fall of the book as the impulse of gravity?
- Why, if the same (short) time occurs in a collision, do we call it an impulse?
- Isn't a force always actually an impulse?
update:
I'm not sure it's helpful to think about the gravitational force, because I can't see a similar physical system where we can imagine the gravitational force delivering a non-zero impulse in zero time. - John Rennie
If I got it right, you are saying that we must consider it an impulse when $t=0$, else it is a force?
- But also when the book falls to the ground because of gravity there is a change of momentum; why is that not an impulse? That is the elusive difference, for me.
force is not defined over a billion years, but:
a force is any interaction which tends to change the motion of an object.[1] In other words, a force can cause an object with mass to change its velocity (which includes to begin moving from a state of rest), i.e., to accelerate. Force can also be described by intuitive concepts such as a push or a pull.
therefore also in a collision there is a push on a ball, exactly the same as here: there is a push on the book that tends to change its motion. What is the difference?
Answer
It's hard to think of a physical system involving a force that acted for zero time. However I think it's useful to consider a collision, perhaps between two billiard balls.
When the balls collide they change momentum. We know that the change of momentum is just the impulse, and we know that the impulse is given by:
$$ J = \int F(t)\,dt $$
where I've used an integral because the force is generally not constant during the collision.
If we use soft squidgy balls then the collision will take a relatively long time as the balls touch, then compress each other, then separate again. If we use extremely hard balls the collision will take a much shorter time because the balls don't deform as much. With the soft balls we get a low force for a long time, with the hard balls we get a high force for a short time, but in both cases (assuming the collision is elastic) the impulse (and change of momentum) is the same.
When we (i.e. undergraduates) are calculating how the balls recoil we generally simplify the system and assume that the collision takes zero time. In this case we get the unphysical situation where the force is infinite but acts for zero time, but we don't care because we recognise it as the limiting case of increasing force for decreasing duration and we know the impulse remains constant as we take this limit.
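A quick numerical sketch of this point, assuming made-up half-sine force profiles (the peak forces and contact times below are arbitrary illustrative values): a low force over a long contact time and a high force over a short contact time integrate to the same impulse.

```python
import numpy as np

# Compare the impulse J = \int F(t) dt for a "soft" (low force, long contact)
# and a "hard" (high force, short contact) collision with half-sine force pulses.

def impulse(peak_force, duration, n=100001):
    t = np.linspace(0.0, duration, n)
    F = peak_force * np.sin(np.pi * t / duration)  # assumed half-sine pulse shape
    return np.trapz(F, t)

print(impulse(peak_force=100.0, duration=0.010))   # soft:  ~0.637 N s
print(impulse(peak_force=1000.0, duration=0.001))  # hard:  ~0.637 N s, same impulse
```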
I'm not sure it's helpful to think about the gravitational force, because I can't see a similar physical system where we can imagine the gravitational force delivering a non-zero impulse in zero time.
Response to edit:
In your edit you added:
If I got it right, you are saying that we must consider it an impulse when t=0, else it is a force?
I am saying that if we use an idealised model where we take the limit of zero collision time the impulse remains a well defined quantity when the force does not.
However I must emphasise that this is an idealisation never achieved in the real world. In real collisions the force and impulse both remain well behaved functions of time and we can do our calculations using the force or using the impulse. We normally choose whichever is most convenient.
I think Mister Mystère offers another good example. If you're flying a spacecraft you might want to fire your rocket motor on a low setting for a long time or at maximum for a short time. In either case what you're normally trying to do is change your momentum, i.e. impulse, by a preset amount and it doesn't matter much how you fire your rockets as long as the impulse reaches the required value.
Response to response to edit:
I'm not sure I fully grasp what you mean regarding the book, but the force of gravity acting on the book does indeed produce an impulse. Suppose we drop the book and it falls for a time $t$. The force on the book is $mg$ so the impulse is:
$$ J = mgt $$
To see that this really is equal to the change in momentum we use the SUVAT equation:
$$ v = u + at $$
In this case we drop the book from rest so $u = 0$, and the acceleration $a$ is just the gravitational acceleration $g$, so after a time $t$ the velocity is:
$$ v = gt $$
Since the initial momentum was zero the change in momentum is $mv$ or:
$$ \Delta p = mgt $$
Which is exactly what we got when we calculated the impulse so $J = \Delta p$ as we expect.
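A quick numerical check, assuming round illustrative values ($m = 1$ kg, $g = 9.8$ m/s$^2$, $t = 0.4$ s):

```python
# Check that the impulse delivered by gravity equals the change in momentum.
m, g, t = 1.0, 9.8, 0.4   # assumed illustrative values

J = m * g * t             # impulse from gravity, J = m g t
v = g * t                 # speed after time t, from v = u + a t with u = 0
dp = m * v - 0.0          # change in momentum, starting from rest

print(J, dp)              # both ~3.92 kg m/s, so J = delta p
```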
general relativity - Are there more distinctive names of "null curves" with certain additional properties (absence of "chord curves")?
In this answer (to the question "In general relativity, are light-like curves light-like geodesics?", PSE/q/76170) a particular example of a curve is discussed whose "tangent is everywhere null" and which is therefore called a "null curve". I'll restate the example curve explicitly as
$$\nu : \mathbb R \rightarrow \cal M,$$
together with a coordinate function
$$\mathbf r : \cal M \rightarrow \mathbb R^{1,2}; \qquad \mathbf r := \{~t,~x,~y~\}$$
such that
$$\mathbf r \circ \nu[~\lambda~] := \{~t_{\nu}[~\lambda~],~x_{\nu}[~\lambda~],~y_{\nu}[~\lambda~]~\} = \{~\lambda,~\text{Cos}[~\lambda~],~\text{Sin}[~\lambda~]~\}.$$
The calculation of the corresponding "tangent" value proceeds (roughly) via
$$\begin{align} \left( \frac{d}{d\lambda}[~t_{\nu}[~\lambda~]~] \right)^2 - \left( \frac{d}{d\lambda}[~x_{\nu}[~\lambda~]~] \right)^2 - \left( \frac{d}{d\lambda}[~y_{\nu}[~\lambda~]~] \right)^2 & = 1 - \left( -\text{Sin}[~\lambda~] \right)^2 - \left( \text{Cos}[~\lambda~] \right)^2 \\ & = 0 \end{align}$$ for all values $\lambda$ in the domain of the "null curve" $\nu$.
Now, interestingly, for any two distinct values $\lambda := a$ and $\lambda := b$ from the domain of "null curve" $\nu$, there exist curves (which in the following are suggestively called "chord curves") $$\kappa_{ab} : [~a, ~b~] \subset \mathbb R \rightarrow \cal M,$$ such that $$\kappa_{ab}[~a~] = \nu[~a~], \qquad \kappa_{ab}[~b~] = \nu[~b~]$$ and: the "tangent of $\kappa_{ab}$" is everywhere positive.
As one concrete case consider $$\mathbf r \circ \kappa[~\lambda~] := \{~t_{\kappa}[~\lambda~],~x_{\kappa}[~\lambda~],~y_{\kappa}[~\lambda~]~\} = $$ $$\{~\small{\lambda,~\text{Cos}[~a~] + \left( \frac{\lambda - a}{b - a} \right) \left( \text{Cos}[~b~] - \text{Cos}[~a~] \right),~\text{Sin}[~a~] + \left( \frac{\lambda - a}{b - a} \right) \left( \text{Sin}[~b~] - \text{Sin}[~a~] \right)~}\},$$
with the corresponding "tangent" value
$$\left( \frac{d}{d\lambda}[~t_{\kappa}[~\lambda~]~] \right)^2 - \left( \frac{d}{d\lambda}[~x_{\kappa}[~\lambda~]~] \right)^2 - \left( \frac{d}{d\lambda}[~y_{\kappa}[~\lambda~]~] \right)^2 = $$ $$\small{1 - \left( \frac{\text{Cos}[~b~] - \text{Cos}[~a~]}{b - a} \right)^2 - \left( \frac{\text{Sin}[~b~] - \text{Sin}[~a~]}{b - a} \right)^2 = 1 - 4 \left( \frac{\text{Sin}[~(b - a)/2~]}{b - a} \right)^2 \gt 0.}$$
(As a sidenote: similarly, for given distinct values $\lambda := a$ and $\lambda := b$ from the domain of "null curve" $\nu$, one may ask for "chord curves" whose "tangent" value should be everywhere negative; but I have not been able to construct a corresponding concrete example. In the following, it is not necessary to distinguish "positive chord curves" from "negative" ones, if cases of the latter exist at all. Relevant is only that the "tangent" value of a "chord curve" exists everywhere and is not "null" anywhere.)
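The two "tangent" values quoted above can be checked symbolically; a minimal sympy sketch, assuming the same $(+,-,-)$ form $(dt/d\lambda)^2 - (dx/d\lambda)^2 - (dy/d\lambda)^2$:

```python
import sympy as sp

lam, a, b = sp.symbols('lambda a b', real=True)

# The "null curve" nu: (t, x, y) = (lambda, cos(lambda), sin(lambda)).
t_nu, x_nu, y_nu = lam, sp.cos(lam), sp.sin(lam)
null_tangent = sp.diff(t_nu, lam)**2 - sp.diff(x_nu, lam)**2 - sp.diff(y_nu, lam)**2
print(sp.simplify(null_tangent))      # 0 for every lambda

# The straight "chord curve" kappa between nu(a) and nu(b).
frac = (lam - a) / (b - a)
t_k = lam
x_k = sp.cos(a) + frac * (sp.cos(b) - sp.cos(a))
y_k = sp.sin(a) + frac * (sp.sin(b) - sp.sin(a))
chord_tangent = sp.diff(t_k, lam)**2 - sp.diff(x_k, lam)**2 - sp.diff(y_k, lam)**2
# simplifies to an expression equal to 1 - 4*(sin((b-a)/2)/(b-a))**2, which is > 0 for a != b
print(sp.simplify(chord_tangent))
```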
My question is:
Is there some specific name (i.e. more distinctive than "just another null curve") referring to "null curves" which do not have any "chord curves" at all? (Are such special cases of "null curves" perhaps called "null geodesics"?)
Saturday, 19 May 2018
electromagnetic radiation - Why is titanium dioxide transparent for visible light but not for UV?
I wonder why TiO2 thin films are transparent to visible light but not to UV. I made a quick search and I found that it is due to the band gap of TiO2: it absorbs UV light but not visible light. I imagine this occurs because of the different wavelengths of these two types of radiation. But what is the relation between the wavelength of a certain type of radiation and the width of the band gap of a semiconducting material? And how does this affect its optical properties?
Answer
The energy per photon of light with wavelength $\lambda$ is given by:
$$ E = \frac{hc}{\lambda} $$
If the energy per photon is smaller than the band gap the light cannot excite electrons from the valence to conduction band so it will pass through the material without being absorbed. If the energy is larger than the band gap the light will excite electrons and will be (partially) absorbed. The cutoff wavelength is given by simply rearranging the above formula to get:
$$ \lambda \approx \frac{hc}{\Delta E} $$
where $\Delta E$ is the band gap. I've used the approximately equal sign because band gaps are rarely sharp and the light absorption will increase over a wavelength range around the cutoff wavelength. If you want to establish the band gap accurately you'd use a Tauc plot.
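A minimal sketch of this estimate, assuming a band gap of 3.2 eV (a commonly quoted value for anatase TiO2, used here purely as an assumption):

```python
# Cutoff-wavelength estimate lambda ~ h c / (band gap).
h = 6.626e-34       # Planck constant, J s
c = 2.998e8         # speed of light, m/s
eV = 1.602e-19      # J per eV

E_gap = 3.2 * eV    # assumed band gap for (anatase) TiO2
lambda_cutoff = h * c / E_gap
print(lambda_cutoff * 1e9)   # ~388 nm: UV is absorbed, visible light passes through
```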
electromagnetism - Derivation of Maxwell stress tensor from EM Lagrangian
From Noether's theorem applied to fields we can get the general expression for the stress-energy-momentum tensor for some fields:
$$T^{\mu}_{\;\nu} = \sum_{i} \left(\frac{\partial \mathcal{L}}{\partial \partial_{\mu}\phi_{i}}\partial_{\nu}\phi_{i}\right)-\delta^{\mu}_{\;\nu}\mathcal{L}$$
The EM Lagrangian, in the Weyl gauge, is:
$$\mathcal{L} = \frac{1}{2}\epsilon_{0}\left(\frac{\partial \vec{A}}{\partial t}\right)^{2}-\frac{1}{2\mu_{0}}\left(\vec{\nabla}\times \vec{A}\right)^{2}$$
Applying the above, all I manage to get for the pressure along x, which I believe corresponds to the first diagonal element of the Maxwell stress tensor, is:
$$p_{x} = \sigma_{xx} = -T^{xx} = \frac{-1}{\mu_{0}}\left(\left(\partial_{x}A_{z}\right)^{2}-\partial_{x}A_{z}\partial_{z}A_{x}-\left(\partial_{x}A_{y}\right)^{2}+\partial_{x}A_{y}\partial_{y}A_{x}\right)+\mathcal{L}$$
But I can't see how this can be equal to what is given on Wikipedia. Why is this?
Answer
Hint: The canonical stress-energy tensor from Noether's theorem is not necessarily symmetric, and often needs to be improved with appropriate improvements terms. This is e.g. the case for EM. See also e.g. this Phys.SE post and links therein.
References:
- Landau and Lifshitz, Vol.2, The Classical Theory of Fields, $\S$33.
Why do we get saturation current in photoelectric effect
So apparently if you increase the number of photons emitted per second to infinity, the photocurrent will approach a limiting value called the 'saturation current'.
I feel like we shouldn't get a saturation current because as one electron is emitted, another will immediately (or close to immediately) fill its place just like in a normal current - thus electrons can pretty much be continuously emitted.
Can anyone explain why we get saturation current? Thanks :D
Why is the photon emitted in the same direction as incoming radiation in Laser?
When an atom “lases” it always gives up its energy in the same direction and phase as the incoming light. Why does this happen? How can this be explained? How does the photon generated because of stimulated emission, know which direction to take? What are the factors leading to this?
Answer
The word "stimulated" means that the emission of the photon is "encouraged" by the existence of photons in the same state as the state where the new photon may be added. The "same state" is one that has the same frequency, the same polarization, and the same direction of motion. Such a one-photon state may be described by the wave vector and the polarization vector, e.g. $|\vec k,\lambda\rangle$.
The physical reason why photons like to be emitted in the very same state as other photons is that they are bosons obeying the Bose-Einstein statistics. The probability amplitude for a new, $(N+1)$-st photon to be added into a one-photon state which already has $N$ photons in it is proportional to the matrix element of the raising operator $$ \langle N+1| a^\dagger|N\rangle = \sqrt{N+1}$$ of the harmonic oscillator between the $N$ times and $(N+1)$ times excited levels. Because the probability amplitude scales like $\sqrt{N+1}$, the probability for the photon to be emitted into the state goes like the squared amplitude i.e. as $N+1$. Recall that $N$ is the number of photons that were already in that state.
This coefficient $N+1$ may be split into $1$ plus $N$. The term $1$ describes the probability of a spontaneous emission – that occurs even if no other photons were present in the state to start with – while the term $N$ is the stimulated emission whose probability scales with the number of photons that are already present.
But in all cases, we must talk about "exactly the same one-photon state" which also means that the direction of the motion is the same. It's because quantum field theory associates one quantum harmonic oscillator with each state i.e. with each information $\vec k$ about the direction of motion and wavelength; combined with binary information about $\lambda$, the polarization (e.g. left-handed vs right-handed).
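A small numerical illustration of the $\sqrt{N+1}$ matrix element, assuming nothing beyond the standard harmonic-oscillator raising operator in a truncated Fock basis (the truncation dimension is arbitrary):

```python
import numpy as np

# Build a^dagger in a truncated Fock basis and check |<N+1| a^dagger |N>|^2 = N + 1,
# i.e. the emission probability grows with the number N of photons already in the mode.
dim = 8
adag = np.diag(np.sqrt(np.arange(1, dim)), k=-1)   # a^dagger |N> = sqrt(N+1) |N+1>

for N in range(dim - 1):
    ket_N = np.zeros(dim); ket_N[N] = 1.0
    bra_Np1 = np.zeros(dim); bra_Np1[N + 1] = 1.0
    amp = bra_Np1 @ adag @ ket_N
    print(N, amp**2)    # prints N and N+1: one unit "spontaneous" plus N "stimulated"
```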
general relativity - Why the basis of vectors and one-forms can not be related through the metric as a vector and one-forms?
I know that the basis vectors and the basis one-forms are related through
$$ \tilde{e}^\mu \cdot \vec{e}_\nu = \delta^\mu _\nu .\tag{1}$$
However, the metric has the property that allows to convert vectors to one-forms. So, can I try to say this: $$ \tilde{e}^\mu = g^{\mu \nu} \vec{e}_\nu , \tag{2}$$
If not, please explain; maybe I am not handling the indices correctly, or there is something further I am missing.
Answer
First I'll state three quick preliminaries so we're both on the same page, and then I'll answer the question. In the following, I'm going to use tildes to distinguish one-forms and their components from vectors and their components; it is traditional to drop this extra notation and simply write $X_\mu = g_{\mu \nu}X^\nu$ as though the $X$'s on either side are the same object. However, since they are emphatically not the same object, I use the notation $\tilde X_\mu = g_{\mu \nu} X^\nu$ to make this explicit.
Preliminary #1: The Metric
The metric $\mathbf{g}$ is an object which linearly eats two vectors and spits out a scalar (e.g. a real number). The components of $\mathbf{g}$ in a particular basis are what you get when you feed the metric the basis vectors:
$$g_{\mu \nu} \equiv \mathbf{g}(\hat e_\mu,\hat e_\nu)$$
Therefore, you often see the action of $\mathbf{g}$ on two vectors $\mathbf{X}=X^\mu \hat e_\mu$ and $\mathbf{Y} = Y^\nu \hat e_\nu$ written like this:
$$\mathbf{g}(\mathbf X,\mathbf Y)=\mathbf{g}(X^\mu \hat e_\mu,Y^\nu \hat e_\nu) = X^\mu Y^\nu \mathbf{g}(\hat e_\mu,\hat e_\nu) = X^\mu Y^\nu g_{\mu \nu}$$
We can pull the components $X^\mu,Y^\nu$ out front because $\mathbf g$ is linear.
Preliminary #2: One-forms
A one-form, or covector, is an object which linearly eats a vector and spits out a scalar. Note that the following is a one-form:
$$\tilde{\mathbf X} := \mathbf{g}(\mathbf{X},\bullet)$$
From a vector $\mathbf X$ and the metric $\mathbf g$, we can construct a one-form $\tilde{\mathbf X}$ by plugging $\mathbf{X}$ into the first slot of $\mathbf g$ and leaving the second slot empty. We say that the one-form $\tilde{\mathbf X}$ is dual to the vector $\mathbf X$.
$\tilde{\mathbf X}$ then acts on some vector $\mathbf Y$ in the obvious way:
$$\tilde{\mathbf X}(\mathbf Y) = \mathbf{g}(\mathbf X,\mathbf Y)$$
In particular, if we feed $\tilde{\mathbf X}$ a basis vector $\hat e_\nu$, we get
$$\tilde{\mathbf X}(\hat e_\nu) = \mathbf{g}(\mathbf X,\hat e_\nu) = X^\mu \mathbf{g}(\hat e_\mu, \hat e_\nu) = X^\mu g_{\mu \nu}$$ If we define the one-form basis $\hat \epsilon^\mu$ to have the property $\hat \epsilon^\mu (\hat e_\nu) = \delta^\mu_\nu$, then we can expand $\tilde{\mathbf X}$ in its components $\tilde X_\mu$. Performing the same action,
$$\tilde{\mathbf X}(\hat e_\nu) = \tilde X_\mu \hat \epsilon^\mu(\hat e_\nu) = \tilde X_\mu \delta^\mu_\nu = \tilde X_\nu$$
Comparing this to what we got before, we see that $\tilde X_\nu = X^\mu g_{\mu\nu}$.
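A concrete sketch of this index lowering, assuming a Minkowski metric of signature $(-,+,+,+)$ and an arbitrary example vector (neither appears in the original answer):

```python
import numpy as np

# "Lowering an index": the components of the one-form dual to X are X~_nu = g_{mu nu} X^mu.
g = np.diag([-1.0, 1.0, 1.0, 1.0])   # assumed Minkowski metric, signature (-,+,+,+)
X = np.array([2.0, 1.0, 0.0, 0.0])   # components X^mu of an example vector

X_lower = g @ X                      # X~_nu = g_{mu nu} X^mu
print(X_lower)                       # [-2.  1.  0.  0.]
```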
Note that even though we say one forms eat vectors and spit out scalars, we can also say that vectors eat one-forms and spit out scalars. We simply define the action of a vector $\mathbf X$ on a one-form $\tilde{\mathbf Y}$ to be $$\mathbf X(\tilde{\mathbf Y}) := \tilde{\mathbf Y}(\mathbf X)$$ This will be relevant in a moment.
Preliminary #3: The Inverse Metric
We've seen that we can use the metric to associate a vector $\mathbf X$ to a dual one-form $\tilde{\mathbf X}$; we can also go the other direction and associate a covector $\tilde{\mathbf Y}$ to a dual vector $\mathbf Y$. We do this by defining the so-called inverse metric $\tilde{\mathbf g}$, which is a map which eats two one-forms and spits out a scalar.
Essentially this is just a metric on the space of one-forms in exactly the same way as $\mathbf g$ is a metric on the space of vectors. However, we link them together by demanding that if the vectors $\mathbf X$ and $\mathbf Y$ have one-form duals $\tilde{\mathbf X}$ and $\tilde{\mathbf Y}$, then
$$\mathbf{g}(\mathbf X,\mathbf Y) = \tilde{\mathbf g}(\tilde{\mathbf X},\tilde{\mathbf Y})$$
It's a straightforward exercise to show that this means that the components of the dual metric $\tilde g^{\mu \nu}$ satisfy the following relationship to the components of the metric:
$$\tilde g^{\mu \nu} g_{\nu \rho} = \delta^\mu_\rho$$
This means that if we express them in matrix form, the $\tilde g^{\mu\nu}$'s are the matrix inverse of the $g_{\mu\nu}$'s - hence the name "inverse metric". We can now use this to associate $\tilde{\mathbf Y}$ with a vector:
$$\mathbf Y = \tilde{\mathbf g}(\tilde{\mathbf{Y}},\bullet)$$ It's straightforward to demonstrate (by feeding the basis one-form $\hat\epsilon^\mu$ to $\mathbf Y$ as defined above) that the components of $\mathbf Y$ are, as expected, given by $$Y^\mu = \tilde{g}^{\mu \nu}\tilde Y_\nu$$
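Continuing the same sketch (same assumed Minkowski metric as above): the components of the inverse metric are literally the matrix inverse of $g_{\mu\nu}$, and they raise the index back.

```python
import numpy as np

g = np.diag([-1.0, 1.0, 1.0, 1.0])        # same assumed metric as in the sketch above
g_inv = np.linalg.inv(g)                   # components g~^{mu nu}

print(np.allclose(g_inv @ g, np.eye(4)))   # True: g~^{mu nu} g_{nu rho} = delta^mu_rho

X_lower = np.array([-2.0, 1.0, 0.0, 0.0])  # the one-form components from before
print(g_inv @ X_lower)                     # [2. 1. 0. 0.]: Y^mu = g~^{mu nu} Y~_nu
```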
Now your question can be answered. It's true that the metric can "convert" vectors into one-forms in an abstract sense. However, when we talk about "raising" and "lowering" indices like you are doing, we are converting the components of a vector to the components of the corresponding one-form.
The expression that you wrote ($\tilde g^{\mu \nu} \hat e_\nu$, in my notation) is simply a linear combination of vectors, and is therefore not a one-form as defined here. It is, however, what you get when you convert the basis one-form $\hat \epsilon^\mu$ into a vector, which I will now show.
Note that the one-form dual to the unit vector $\hat e_\mu$ is not the basis one-form $\hat \epsilon^\mu$; it is the one-form
$$\tilde{\boldsymbol \omega} := \mathbf{g}(\hat e_\mu, \bullet)$$
which has components
$$\tilde \omega_{\nu} \equiv \tilde{\boldsymbol\omega}(\hat e_\nu) = \mathbf{g}(\hat e_\mu, \hat e_\nu) = g_{\mu \nu}$$
To be concrete, the dual to the basis vector $\hat e_0$ is the one form $$\tilde{\boldsymbol \omega} = g_{0\nu}\hat \epsilon^\nu = g_{00}\hat \epsilon^0 + g_{01}\hat \epsilon^1 + g_{02}\hat \epsilon^2 + g_{03} \hat \epsilon^3$$
Using the inverse metric in precisely the same way, the dual to the basis one-form $\hat \epsilon^0$ is the vector
$$\boldsymbol \omega = \tilde g^{0\nu}\hat e_\nu = \tilde g^{00}\hat e_0 + \tilde g^{01} \hat e_1 + \tilde g^{02} \hat e_2 + \tilde g^{03} \hat e_3$$
To conclude, we do not have that $$\hat\epsilon^\mu = g^{\mu \nu} \hat e_\nu$$ but rather that $$\underbrace{\tilde{\mathbf g}(\hat \epsilon^\mu,\bullet)}_{\text{The vector dual to the basis one-form }\hat\epsilon^\mu} = \tilde g^{\mu \nu}\hat e_\nu$$
friction - Why does a tire produce more traction when sliding slightly?
It is well known in racing that driving the car on the ideal "slip angle" of the tire where it is crabbing slightly from the pointed direction produces more cornering speed than a lower slip angle or a higher one.
(More explanation as requested) I'm considering two main effects on the tire when in a turn:
The tread of the tire is twisted from the angle of the wheel it is mounted to. There is more force as speed increases, and generally, more twisting.
The tire slides somewhat at an angle on the road surface rather than rolling.
At low speeds, the angle between the pointed direction of the wheel (90 degrees to the axis of rotation) and the direction of travel is nearly 0. When the speed increases to the point that the angle reaches about 10 degrees, the tire generates more grip and the car goes faster around the turn. (Higher angles produce lower grip.)
So the grip is higher at 10 degrees of slip than at 0 or 20 degrees.
What is the physical effect that causes this increase in grip?
special relativity - Metric tensor of coordinate transformation
How do you find a metric tensor given a coordinate transformation, $(t', x', y', z') \rightarrow (t, x, y, z)$? Our textbook gives a somewhat vague example as it skips some steps making it difficult to understand. What's the general definition for a metric tensor of a given transformation? The closest I could find was http://en.wikipedia.org/wiki/Metric_tensor#Coordinate_transformations, but I'm having trouble understanding that.
Answer
You look at the distance between two infinitesimally different points. Let the two coordinate systems be x and y, where x is four numbers and y is four numbers. Consider an infinitesimal displacement from y to y+dy. You know this distance in the x coordinates, so you find the two endpoints of the displacement
$$x^i(y)$$ $$x^i(y + dy) = x^i(y) + {\partial x^i \over \partial y^j} dy^j $$
This is using the Einstein summation convention--- repeated upper/lower indices are summed automatically, and an upper index in the denominator of a differential expression becomes a lower index, and vice-versa. The distance between these two infinitesimally separated points is:
$$ g_{ij}(x) {\partial x^i \over \partial y^k} {\partial x^j \over \partial y^l} dy^k dy^l $$
And from this, you read off the metric tensor coefficients--- since this is the quadratic expression for the distance between y and y+dy.
$$ g'_{kl}(y) = g_{ij}(x(y)) {\partial x^i \over \partial y^k} {\partial x^j \over \partial y^l}$$
This is a special case of the tensor transformation law--- every lower index transforms by getting contracted with a Jacobian inverse, and every upper index by getting contracted with a Jacobian.
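A worked sketch of this transformation law, assuming a simple example not taken from the answer (2D Euclidean metric in Cartesian coordinates, transformed to polar coordinates), which recovers $ds^2 = dr^2 + r^2\,d\phi^2$:

```python
import sympy as sp

# Apply g'_{kl}(y) = g_{ij}(x(y)) dx^i/dy^k dx^j/dy^l with x = (x1, x2) Cartesian
# and y = (r, phi) polar coordinates.
r, phi = sp.symbols('r phi', positive=True)

x = sp.Matrix([r * sp.cos(phi), r * sp.sin(phi)])   # x(y)
y = sp.Matrix([r, phi])
J = x.jacobian(y)                                    # J[i, k] = dx^i / dy^k

g_x = sp.eye(2)                                      # g_{ij} = delta_{ij} in Cartesian coordinates
g_y = sp.simplify(J.T * g_x * J)                     # g'_{kl} = g_{ij} J^i_k J^j_l
print(g_y)                                           # Matrix([[1, 0], [0, r**2]])
```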
Friday, 18 May 2018
quantum field theory - What are the assumptions that $C$, $P$, and $T$ must satisfy?
I am not asking for a proof of the $CPT$ theorem. I am asking how the $CPT$ theorem can even be defined.
As matrices in $O(1,3)$, $T$ and $P$ are just $$ T = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \hspace{1cm} P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} $$ These satisfy certain properties. For one, as matrices, $T^2 = 1$, $P^2 = 1$. (Therefore any homomorphism of $O(1,3)$ must also satisfy this property.) Working with these matrices, it can be shown that rotations in $\mathfrak{so}(1,3)$ commute with $T$ while boosts anti-commute with $T$ and $P$. This is just the definition of $T$ and $P$ as elements in $O(1,3)$.
In quantum field theory, we require that our Hilbert space carries a (projective) representation of $SO^+(1,3)$, where $SO^+(1,3)$ is the special orthochronous Lorentz group, i.e. the part of $SO(1,3)$ connected to the identity. (In other words, we want a true representation of $Spin(1,3)$.) We can define how local operators $\mathcal{O}_\alpha(x)$ transform via conjugation. Namely, for all $\tilde\Lambda \in Spin(1,3)$, we want $$ U(\tilde\Lambda) \mathcal{O}_\alpha(x) U(\tilde\Lambda)^{-1} = D_{\alpha \beta}(\tilde \Lambda) \mathcal{O}_\beta (\Lambda x) $$ where $\Lambda \in SO(1,3)$ is the corresponding element of $\tilde{\Lambda}$ and $D_{\alpha \beta}$ must be a representation of $Spin(1,3)$.
This is a great way to do things. Our requirements for $U$ and $\mathcal{O}_\alpha$ have physical motivation, and it gives us a task: find representations of $Spin(1,3)$ and define quantum fields out of them.
What this approach does not offer, at first glance, is how to incorporate $T$ or $P$, let alone $C$. We know that we can't just look for group homomorphisms from $O(1,3)$ to operators on our Hilbert space, because we all know that $\hat P^2 = 1$ need not be true in quantum field theory. What physically motivated mathematical requirements do we have to put on $C$, $P$, and $T$ that should "determine" them (in a suitable sense) for different Hilbert spaces we've constructed? The introduction of $C$ is especially confusing, because it requires us to swap particle states and anti-particle states, but such states are usually defined via words (i.e., this is a particle, this is an anti-particle, here's how they work...). After placing proper requirements on $C$, $P$, and $T$, one should in theory be able to prove the $CPT$ theorem, show $T$ must be anti-unitary, etc. I know this is a big question, so references that discuss these subtleties would also be appreciated.
Answer
Mathematical physicists will tell you the question you're asking has no answer: only CPT as a whole has a rigorous definition. That means that practicing physicists, who consider concrete problems, are free to define it however they want! So while I don't know the mathematical niceties, let me lay out what I think particle physicists usually mean when they say "a P/C/T transformation".
Particles and antiparticles
Recall that a quantum field has the generic mode expansion $$\hat{\psi}(x) = \sum_{p, s} a_{p, s} u_s(p) e^{-ipx} + b^\dagger_{p, s} v_s(p) e^{ipx}$$ where $p$ stands for the momentum and $s$ stands for all other internal quantum numbers, such as spin, and $u_s(p)$ and $v_s(p)$ are polarizations. The $a_{p, s}$ and $b^\dagger_{p, s}$ are annihilation and creation operators. It is clear that the $a$-modes and $b$-modes are qualitatively different: we can't transform one into another by a Lorentz transformation because the modes have positive and negative frequencies, and the operators $a^\dagger_{p, s}$ and $b^\dagger_{p, s}$ transform oppositely under internal symmetries because $a_{p, s}$ and $b^\dagger_{p, s}$ must transform the same way. This implies, e.g., that they must create particles of opposite electric charge.
To account for this difference, we conventionally call one of these excitations "particles" and the other "antiparticles". Of course, which is which is just a matter of convention; the point is that there's a real distinction to be made here. (The fact that there are two different species at all is because the modes can have positive or negative frequency, and that's a consequence of Lorentz invariance; you don't need to have two types of modes in nonrelativistic field theory. That's what people mean when they say relativistic QFT predicts antimatter.)
Quantum discrete symmetries
Rough and ready naive definitions of parity, charge conjugation, and time reversal are:
- Parity: $\hat{P} a_{p, s} \hat{P}^{-1} = a_{p', s}$ where $p'$ is $p$ with flipped $3$-momentum and $s$ stays the same.
- Charge conjugation: $\hat{C} a_{p, s} \hat{C}^{-1} = $ any antiparticle annihilation operator with the same $p$ and $s$. (Not necessarily $b_{p, s}$ in the expression above.)
- Time reversal: $\hat{T} a_{p, s} \hat{T}^{-1} = a_{p', s'}$ where $\hat{T}$ is antilinear, $s'$ has the spin $s$ flipped.
These requirements are directly derived from what we expect classically. They are already nontrivial. For instance, in a theory of a single Weyl spinor it is impossible to define $\hat{P}$ because if $a_{p, s}$ exists, then $a_{p', s}$ does not, because it would have the wrong helicity. It's also impossible to define $\hat{C}$, again because there is nothing for $a_{p, s}$ to map to. Similarly one can prove the electroweak theory is not $\hat{P}$ or $\hat{C}$ symmetric, though both can be defined.
From these definitions alone it's easy to show all the familiar properties. For example, using the mode expansion, you can show that the quantum field itself transforms how you would expect. For instance, under parity, $\hat{\psi}(\mathbf{x}, t)$ is mapped to $P \hat{\psi}(-\mathbf{x}, t)$ where $P$ is a numerical matrix that can shuffle up the field components. So I imagine one could define the discrete symmetries directly by how they act on fields, though that would probably be clunkier.
More general definitions
People will often use definitions that are more general. For example, charge conjugation is not a symmetry of QED unless you allow the photon creation/annihilation operators to pick up an extra minus sign. So conventionally we allow all these discrete symmetries to be defined up to phases. Allowing this gives us a symmetry to work with, which yields nontrivial information, while sticking to the strict definition gives us nothing.
As a more drastic step, in left-right symmetric models one might have a gauge group like $SU(2)_L \times SU(2)_R$, and one can define "generalized parity" to send $\mathbf{x} \to -\mathbf{x}$ and swap these two gauge groups. That's a big change, but the spirit is the same: it's a discrete symmetry of the theory we can use to constrain dynamics, and it has some features in common with parity, so we call it that. This is useful because the point of these models is to make the $\theta$-term of QCD vanish, and this generalized parity does the trick.
Classical discrete symmetries
It should be cautioned that there are another three things commonly called parity, charge conjugation, and time reversal which are completely different. These are discrete symmetries of classical fields. For a classical field $\psi(\mathbf{x}, t)$ they are heuristically defined as
- Parity: $\psi(\mathbf{x}, t) \to M_P \psi(-\mathbf{x}, t)$
- Charge conjugation: $\psi(\mathbf{x}, t) \to M_C \psi^*(\mathbf{x}, t)$
- Time reversal: $\psi(\mathbf{x}, t) \to M_T \psi(\mathbf{x}, -t)$
where $M_P$, $M_C$, and $M_T$ are arbitrary numerical matrices. These matrices are usually chosen to preserve the convention for the ordering of the field components. For example, in a Dirac spinor we often put the left-chirality components on top, but after a parity transformation the right-chirality components are on top. The matrix $M_P$, which is $\gamma_0$ in some conventions, puts the components back in the usual order. Similarly, in QED we have $M_C = -1$ for the same reason as in the quantum case. For more examples, see Ryan Thorngren's existing answer.
These classical discrete symmetries are primarily useful for doing representation theory at the level of Lagrangians, and have nothing to do with the CPT theorem. Just like the quantum discrete symmetries, one may broaden the definitions if it's convenient.
A warning: the classical discrete symmetries are often identified with quantum discrete symmetries because they both act on an object called $\psi$ in a similar way. However, the actions are rarely identical. I talk about the pitfalls involving charge conjugation in detail here.
To make things worse, one can also define discrete symmetries for first-quantized wavefunctions (also called $\psi$) or for second-quantized one-particle wavefunctions (also called $\psi$), and of course in all four cases the symmetries are defined slightly differently. So if you find anything titled something like "discrete symmetries explained intuitively!", there is a well above $3/4$ chance it's not talking about the real quantum ones at all. Be careful!
Further questions
This answer is already outrageously long, but let me answer a few questions from the OP.
- Do P̂ , Ĉ , T̂ have to be their own inverses?
No, because of the extra phases I mentioned above; see this question. Again, it depends on the convention. You could take a stricter convention so that $\hat{P}$ always squares to one, but that's just not useful, because often a modified $\hat{P}$ that doesn't square to one will be conserved, and you'll want to talk about it. Also, $\hat{T}$ doesn't even square to one in nonrelativistic QM, so you really shouldn't expect it to in QFT.
- Does a CP violation occur when doing a CP transformation on classical fields changes the Lagrangian? If we are free to define the numerical matrices as we like, can different choices lead to ambiguity in whether or not CP is violated?
When we talk about CP violation, we're usually concerned with baryogenesis. Since the antiparticle of a baryon has the opposite baryon number, a net baryon number violates both quantum C and quantum CP. The same logic holds for leptogenesis with lepton numbers. We are talking about quantum particles here, so we mean quantum symmetries. This statement remains true up to adjusting what C and CP mean, as long as they still flip baryon/lepton number.
Again, symmetries are chosen because they are convenient tools. If you refuse to allow extra phases, then even QED alone has both C and CP violation. But this is not a useful statement, because it is still true regardless that pure QED won't give you leptogenesis; the dynamics of a theory don't depend on what we call the symmetries. We choose to define C and CP so that they are symmetries of QED, which allow us to deduce this fact more easily.
- Certainly the classical transformations relate to the QFT one in some way?
A classical symmetry of the action is promoted to a quantum symmetry of the action unless there are anomalies, so yes. The issue is that the conventions are different.
For instance, consider the theory of a single charged Weyl spinor. Classical C simply flips its chirality. Quantum C and quantum P are both not defined at all, but classical C corresponds roughly to what would have been quantum CP.
Luckily you don't have to worry about this if you just stick to scalars and vectors; it's just the spinors that are annoying. For example, the CP violation from the theta term is usually deduced by showing it's not invariant under classical CP, which equals quantum CP.
- Is a pseudoscalar just a scalar with a different choice of Mp? Why does a classical choice of numerical matrix constrain allowed Lagrangian interaction terms?
Same answer as the others. You can choose to define $M_p$ however you want, but if you forbid signs you won't get a symmetry. Again, the Lagrangian actually is constrained no matter what we do, but it's easiest to see if we define a symmetry with appropriate minus signs for certain fields, called pseudoscalars, which we call parity. (Specifically, if a Lagrangian has a certain symmetry, then under RG flow only terms with that symmetry are generated. That means we should only write down terms respecting the symmetry. But the RG flow calculation works the same even if we don't know the symmetries are there.)
You might ask: given this freedom in redefinition, would the world really look the same if we reflected it inside-out about the origin? Which parity is the true, physical parity? Since nobody can ever actually do this, it's a meaningless question.