Thursday 31 August 2017

newtonian mechanics - Why don't we consider jerk in physics classes?



When I got more into physics, I started asking myself whether, just as acceleration represents the growth of speed, something else could represent the growth of acceleration itself. It turns out such a quantity exists and is called jerk. Before I thought about this, I assumed acceleration was the highest derivative that made sense in physics.


My question is: why can we perform accurate calculations without accounting for jerk? I never needed it in a single physics class! I've also read that it is seldom used.


Also, while position, speed and acceleration seem familiar to me due to everyday experience, why doesn't jerk seem familiar to me? I never tell someone when driving a car "you're jerking too much", whereas I could say "you are accelerating too much".


Is it that jerk has little influence because it is almost constant? Even if it is not, what about higher-order derivatives? Why don't derivatives higher than acceleration have much influence on calculations? We perform a lot of everyday reasoning without resorting to them!




Wednesday 30 August 2017

How do you add temperatures?


This will probably be considered very simple, but I am just a beginner:


I'm developing a software application where temperatures need to be added and subtracted. Some temperatures are in Celsius, some in Kelvin. I know how to convert to/from Kelvin (add or subtract 273.15), but how should one go about adding and subtracting these temperatures? Should everything be converted to Celsius first?


For example:


0°C + 0°C = 0°C
0°C + 500°C = 500°C

But:



0°C + 273.15K = ?

If we put everything in Kelvin, we get:


273.15K + 273.15K = 546.3K

If we put everything in Celsius, we get:


0°C + 0°C = 0°C

But obviously, 546.3K isn't equal to 0°C.


Now, you might say I can't add temperatures to temperatures (but should be adding energies or something? not sure). But the reason I'm doing this is that we need to interpolate. I have a collection of key-value pairs, like this:



973K  -> 0.0025
1073K -> 0.0042
1173K -> 0.03
1273K -> 0.03

Now I need to get the value for 828°C. So I need to interpolate, which means adding/subtracting values.


I hope I'm making sense.



Answer



You may always add the numbers in front of the units, and if the units are the same, one could argue that the addition satisfies the rules of dimensional analysis.


However, it still doesn't imply that it's meaningful to sum the temperatures. In other words, it doesn't mean that these sums of numbers have natural physical interpretations. If one adds them, he should add the absolute temperatures (in kelvins) because in that case, one is basically adding "energies per degree of freedom", and it makes sense to add energies.



Adding numbers in front of "Celsius degrees", i.e. non-absolute temperatures, is physically meaningless, unless one is computing an average of a sort. This is a point that famously drove Richard Feynman up the wall. Read Judging books by their covers and search for "temperature". He was really mad about a textbook that wanted to force children to add numbers by asking them to calculate the "total temperature", a physically meaningless concept.


It only makes sense to add figures with the units of "Celsius degrees" if these quantities are interpreted as temperature differences, not temperatures. As a unit of temperature difference, one Celsius degree is exactly the same thing as one kelvin.


If you interpolate or extrapolate a function of the temperature, $f(T)$, you do it as you would for any other function, ignoring the information that the independent variable is the temperature. The results of the simplest extrapolation/interpolation techniques won't depend on the units of temperature you used.
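To make the last point concrete, here is a minimal sketch in Python, using the questioner's table and an assumed plain linear interpolation, showing that interpolating at 828 °C gives the same value whether the table is expressed in kelvins or in degrees Celsius (linear interpolation uses only temperature differences):

```python
# The question's table, in kelvins.
temps_K = [973.0, 1073.0, 1173.0, 1273.0]
values  = [0.0025, 0.0042, 0.03, 0.03]

def interp(x, xs, ys):
    """Plain linear interpolation (assumes xs is sorted ascending)."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x outside table range")

target_C = 828.0
target_K = target_C + 273.15   # 1101.15 K

in_kelvin  = interp(target_K, temps_K, values)
in_celsius = interp(target_C, [T - 273.15 for T in temps_K], values)

print(in_kelvin)    # ~0.01146
print(in_celsius)   # same value, up to rounding
```

Shifting every temperature by 273.15 shifts both the query point and the table nodes, so the differences that enter the formula are unchanged.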


newtonian mechanics - velocity, mass and force


Is there an equation linking velocity, mass and force? I want to find the maximum force that an object of any given mass could exert. Assuming the object with the given velocity stops dead in an instant, I thought of using $F=ma$, but there is no time frame for the slowing down of the object since it's instant. Thank you



Answer



Change in momentum $\Delta (m v)$ is equal to impulse $F \Delta t$, the force multiplied by the time that the force acts on the body.


So to stop in an instant, the force is very large.
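A quick numerical illustration of the impulse-momentum relation, with made-up numbers, showing how the average force diverges as the stopping time shrinks:

```python
# Impulse-momentum theorem: F_avg * dt = change in momentum m*v.
# Hypothetical numbers: a 2 kg object moving at 10 m/s.
m = 2.0        # kg
v = 10.0       # m/s
dt = 0.01      # s, stopping time

F_avg = m * v / dt   # average force during the stop, in newtons
print(F_avg)         # 2000.0

# As dt -> 0 ("stops dead in an instant") the required force diverges:
for stop_time in (0.01, 0.001, 0.0001):
    print(stop_time, m * v / stop_time)
```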


condensed matter - Derivation of the gradient expansion of the Keldysh nonlinear sigma model for disorder metals


My confusion relates to Appendix C of this paper, although the same derivation is presented in many others. When deriving the gradient expansion, one arrives at a term quadratic in the scalar potential, which takes the form $\mathrm{Tr}V^\alpha \Upsilon^{\alpha\beta} V^\beta$ where $\alpha,\beta \in \{\mathrm{cl}, \mathrm{q}\}$ are indices in the Keldysh matrix space, the trace is over space and time indices, and $$\Upsilon^{\alpha\beta}(\omega) = - \frac{1}{2}\sum_{\mathbf{p}} \mathrm{Tr}_K \mathcal{G}(\mathbf{p}, \epsilon+\omega/2)\gamma^\alpha\mathcal{G}(\mathbf{p}, \epsilon-\omega/2)\gamma^\beta$$ with $\mathrm{Tr}_K$ the trace over Fermionic Keldysh indices, $\gamma^{cl} = \sigma^0$, $\gamma^q=\sigma^1$ are matrices in the Fermionic Keldysh space, and $\mathcal{G}$ is the disorder averaged Fermionic Green's function. In particular, we write the Fermionic Green's function as $$\hat{\mathcal{G}} = \frac{1}{2}\mathcal{G}^R \left(1 + \hat{\Lambda}\right) + \frac{1}{2}\mathcal{G}^A \left(1 - \hat{\Lambda}\right)$$ where $\mathcal{G}^{R(A)} = [\epsilon - \xi \pm \frac{i}{2\tau}]^{-1}$ is the retarded (advanced) disorder averaged Green's function and $\Lambda$ is the stationary saddle point solution of the non-linear sigma model $$\hat{\Lambda} = \begin{pmatrix} 1^R&2F\\ 0&-1^A \end{pmatrix}$$ cf. Eq. 165. This expression is then evaluated to be $i\nu \hat{\sigma}_1^{\alpha\beta}$ where $\sigma_1$ is the first Pauli matrix in the bosonic Keldysh space and $\nu$ is the density of states at the Fermi surface.


So far so good. My confusion is the decision in Appendix C to only include the products $\mathcal{G}^R\mathcal{G}^R$ or $\mathcal{G}^A\mathcal{G}^A$ in the integral. Using the above parametrization of the Keldysh Green's function one can see that $\Upsilon^{\mathrm{cl},\mathrm{cl}}=0$, but each of the other components is not identically zero and contains $(\mathcal{G}^R)^2$, $(\mathcal{G}^A)^2$, and $\mathcal{G}^R\mathcal{G}^A$ terms. I don't follow the argument for why only the R-R terms should be included; in fact, in the other parts of the gradient expansion it is instead the R-A terms that are kept, and the R-R and A-A terms vanish due to the analytic structure of the Green's functions. Furthermore, it is not clear to me why the q-q component is zero. Am I missing something about the analytical structure of the integrands, or is there a physical argument for this?




dimensional analysis - About the dimension of the SI units vector space


We know that the set of fundamental and derived physical units can be structured as a vector space over the rational numbers. In the International System of Units the dimension of this space is $7$ (the seven fundamental units form a basis). Does this number have some special physical significance, or is it simply a result of historical events and practical conveniences?



Answer



This number of “independent dimensions” is physically meaningless and just a historical convention. For example, we don’t have to treat distance and time as independent dimensions. We could decide to measure both in seconds, where “1 second of distance” is the distance light travels in vacuum in one second. Physicists do this kind of thing all the time when they use “natural units” with $c=1$ and/or $\hbar=1$ and/or $G=1$ and/or $k=1$, etc.
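For instance, converting distances into "seconds of distance" is a one-line calculation (the Earth-Sun distance below is an illustrative rounded value):

```python
# With c = 1, "1 second of distance" is the distance light travels in 1 s.
c = 299_792_458.0          # m/s (exact, by the SI definition of the metre)

def meters_to_seconds(d_m):
    """Express a distance in seconds, i.e. light travel time."""
    return d_m / c

earth_sun_m = 1.496e11     # average Earth-Sun distance in meters (rounded)
print(meters_to_seconds(earth_sun_m))   # ~499 s, i.e. about 8.3 light-minutes
```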


gravity - Orbit in the vacuum


Since space is a vacuum and there is no friction in space, can we assume that, if we place an object at exactly the right distance from a planet and with the right velocity, it will orbit indefinitely, or until another object with a gravitational force interferes?




electromagnetism - Can the Earth's magnetic field be used to generate electricity?


Since the Earth has a magnetic field, can it, in theory, be run through a conductive metal coil to create electricity?




Answer



Not really. A magnetic field alone doesn't create electricity. A changing magnetic field does. The Earth's magnetic field does change a tiny bit but not enough to really generate much.


The other option is to move the inductor in the magnetic field. The Earth's magnetic field is quite homogeneous over short distances, though, so the coil would need to move fast and very far to generate much. This would use more energy than it creates (at least on the surface of the Earth).


Several years back there was an experiment (the Space Tether Experiment) to drag a conductor through the Earth's magnetic field with the Space Shuttle. I don't know how viable this is, though, because I think it saps orbital energy.


Tuesday 29 August 2017

general relativity - Free-fall path into a black hole in Kruskal Coordinates


If an object at t=0 begins to free-fall into a black hole from X in Kruskal coordinates (https://en.wikipedia.org/wiki/Kruskal%E2%80%93Szekeres_coordinates), what does its path on the Kruskal-Szekeres diagram look like? Is it a hyperbola, or a straight line, or something else?



Answer



It looks like this picture on the right from Misner-Thorne-Wheeler:




It's almost a straight line.


But can I add this: I don't like Kruskal-Szekeres coordinates at all. Take a look at the picture on the left. That shows the object's path on the Schwarzschild diagram. Note how it's truncated vertically? The vertical axis is the time axis. In order to cross the event horizon at r=2M, the infalling body allegedly goes to the end of time and back$^*$. The horizontal axis is the spatial axis. Look across from right to left near where it says τ=33.3M. The infalling body is in two places at once, just like the elephant and the event horizon. Also note where the Wikipedia Kruskal-Szekeres article says this: "Note GM is the gravitational constant multiplied by the Schwarzschild mass parameter, and this article is using units where c = 1". That just doesn't square with what Einstein said: "As a simple geometric consideration shows, the curvature of light rays occurs only in spaces where the speed of light is spatially variable". You'll be aware that the coordinate speed of light at the event horizon is zero? So a light clock stops? IMHO Kruskal-Szekeres airbrushes over this by effectively putting a stopped observer in front of a stopped light-clock and claiming he still sees it ticking normally.


$*$ Seriously. You can even read about this in the mathspages Formation and Growth of Black Holes: "the infalling object traverses through infinite coordinate time in order to reach the event horizon, and then traverses back through an infinite range of coordinate times until reaching r = 0 (in the interior) in a net coordinate time that is not too different from the elapsed proper time".


hilbert space - Should it be obvious that independent quantum states are composed by taking the tensor product?


My text introduces multi-qubit quantum states with the example of a state that can be "factored" into two (non-entangled) substates. It then goes on to suggest that it should be obvious1 that the joint state of two (non-entangled) substates should be the tensor product of the substates: that is, for example, that given a first qubit


$$\left|a\right\rangle = \alpha_1\left|0\right\rangle+\alpha_2\left|1\right\rangle$$



and a second qubit


$$\left|b\right\rangle = \beta_1\left|0\right\rangle+\beta_2\left|1\right\rangle$$


any non-entangled joint two-qubit state of $\left|a\right\rangle$ and $\left|b\right\rangle$ will be


$$\left|a\right\rangle\otimes\left|b\right\rangle = \alpha_1 \beta_1\left|00\right\rangle+\alpha_1\beta_2\left|01\right\rangle+\alpha_2\beta_1\left|10\right\rangle+\alpha_2\beta_2\left|11\right\rangle$$ but it isn't clear to me why this should be the case.


It seems to me there is some implicit understanding or interpretation of the coefficients $\alpha_i$ and $\beta_i$ that is used to arrive at this conclusion. It's clear enough why this should be true in a classical case, where the coefficients represent (where normalized, relative) abundance, so that the result follows from simple combinatorics. But what accounts for the assertion that this is true for a quantum system, in which (at least in my text, up to this point) coefficients only have this correspondence by analogy (and a perplexing analogy at that, since they can be complex and negative)?


Should it be obvious that independent quantum states are composed by taking the tensor product, or is some additional observation or definition (e.g. of the nature of the coefficients of quantum states) required?




1: See (bottom of p. 18) "so the state of the two qubits must be the product" (emphasis added).



Answer



Great question! I don't think there is anything obvious at play here.



In quantum mechanics, we assume that the state of any system is a normalized element of a Hilbert space $\mathcal H$. I'm going to limit the discussion to systems characterized by finite-dimensional Hilbert spaces for conceptual and mathematical simplicity.


Each observable quantity of the system is represented by a self-adjoint operator $\Omega$ whose eigenvalues $\omega_i$ are the values that one can obtain after performing a measurement of that observable. If a system is in the state $|\psi\rangle$, then when one performs a measurement on the system, the state of the system collapses to one of the eigenvectors $|\omega_i\rangle$ with probability $|\langle \omega_i|\psi\rangle|^2$.


The spectral theorem guarantees that the eigenvectors of each observable form an orthonormal basis for the Hilbert space, so each state $\psi$ can be written as $$ |\psi\rangle = \sum_{i}\alpha_i|\omega_i\rangle $$ for some complex numbers $\alpha_i$ such that $\sum_i|\alpha_i|^2 = 1$. From the measurement rule above, it follows that the $|\alpha_i|^2$ represents the probability that upon measurement of the observable $\Omega$, the system will collapse to the state $|\omega_i\rangle$ after the measurement. Therefore, the numbers $\alpha_i$, although complex, do in this sense represent "relative abundance" as you put it. To make this interpretation sharp, you could think of a state $|\psi\rangle$ as an ensemble of $N$ identically prepared systems with the number of $N_i$ elements in the ensemble corresponding to the state $|\omega_i\rangle$ equaling $|\alpha_i|^2 N$.


Now suppose that we have two quantum systems on Hilbert spaces $\mathcal H_1$ and $\mathcal H_2$ with observables $\Omega_1$ and $\Omega_2$ respectively. If we make a measurement of both observables on the combined system, system 1 will collapse to some $|\omega_{1i}\rangle$ and system 2 will collapse to some state $|\omega_{2 j}\rangle$. It seems reasonable then to expect that the state of the combined system after the measurement could be any such pair. Moreover, the quantum superposition principle tells us that any complex linear combination of such pair states should also be a physically allowed state of the system. These considerations naturally lead us to use the tensor product $\mathcal H_1\otimes\mathcal H_2$ to describe composite systems because it is the formalization of the idea that the combined Hilbert space should consist of all linear combinations of pairs of states in the constituent subsystems.
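As a concrete check of the coefficient rule in the question, here is a small sketch using NumPy's Kronecker product; the two amplitudes are arbitrary normalized examples:

```python
import numpy as np

# Two single-qubit states |a> = a1|0> + a2|1>, |b> = b1|0> + b2|1>.
a = np.array([3/5, 4j/5])                    # |3/5|^2 + |4/5|^2 = 1
b = np.array([1/np.sqrt(2), 1/np.sqrt(2)])   # equal superposition

# Joint state of the non-entangled pair: the Kronecker (tensor) product,
# with basis ordered as |00>, |01>, |10>, |11>.
ab = np.kron(a, b)

# Coefficients match a_i * b_j term by term, as in the question's formula:
expected = np.array([a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]])
print(np.allclose(ab, expected))             # True

# The product of normalized states is automatically normalized:
print(np.isclose(np.linalg.norm(ab), 1.0))   # True
```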


Is that the sort of motivation for using tensor products that you were looking for?


general relativity - The role of the affine connection the geodesic equation


I apologise in advance that my knowledge of differential geometry and GR is very limited. In general relativity the equation of motion for a particle moving only under the influence of gravity is given by the geodesic equation:


$$ \ddot{x}^\lambda + \Gamma^\lambda_{\mu\nu}\dot{x}^\mu\dot{x}^\nu =0. $$


I am looking for a conceptual description of the role of the affine connection, $\Gamma^\lambda_{\mu\nu}$ in this equation. I understand that it is something to do with notion of a straight line in curved space.


Comparing it to the equation for a free particle according to Newtonian gravity:


$$ \ddot{\mathbf{x}} = -\nabla\Phi, $$


Then it kind of looks like the affine connection is our equivalent of how to differentiate, except that our scalar field is now some kind of velocity?



Answer



You're on the right track, but there's more that can be said. For an introduction to this topic, I highly recommend Sean Carroll's Spacetime and Geometry, which I'll follow below for the purpose of illustrating where that $\Gamma$ comes from. The book grew out of lecture notes, the relevant chapter of which can be found online here.






Note: This is long, but only because each step has been broken down into very simple parts.


You want a directional derivative - something that tells you how a tensor changes as you move along some path in your manifold. Just like in standard multivariable calculus, you define this as the sum of the derivatives of your object in each direction, weighted by how much your coordinate is changing in that direction: $$ \frac{\mathrm{D}}{\mathrm{D}\lambda} := \frac{\mathrm{d}x^\mu}{\mathrm{d}\lambda} \nabla_\mu. $$ Here $\lambda$ parameterizes your path, and the appropriate derivative to use is the covariant derivative. The scalar function $x^\mu$ is being differentiated with the standard directional derivative: $$ \frac{\mathrm{d}}{\mathrm{d}\lambda} := \frac{\mathrm{d}x^\mu}{\mathrm{d}\lambda} \partial_\mu. $$ Note that this is consistent with the fully covariant directional derivative in the case of scalar functions, and, despite appearances, this is not a circular definition (you should be able to differentiate known functions of $\lambda$ with respect to $\lambda$).


The affine connection is simply defined as the set of coefficients needed to augment the partial derivative in order to make a covariant derivative: $$ \nabla_\mu V^\nu = \partial_\mu V^\nu + \Gamma^\nu_{\mu\lambda} V^\lambda $$ for any vector $\vec{V}$. You don't even need to know how to generalize this to covariant derivatives of arbitrary tensors.


The geodesic equation arises from imposing the requirement that your curve parallel-transport its own tangent vector.1 That is, the directional derivative of the tangent vector along the direction it is pointing vanishes. A path will be a set of smooth functions \begin{align} x^\mu : \mathbb{R} & \to M \\ \lambda & \to x^\mu(\lambda), \end{align} one for each coordinate $\mu$. If you denote the tangent vector at $\vec{x}(\lambda)$ by $\vec{T}(\lambda)$, then $$ T^\mu = \frac{\mathrm{d}x^\mu}{\mathrm{d}\lambda}. $$


Putting all this together, we have \begin{align} 0 & = \frac{\mathrm{D}}{\mathrm{D}\lambda} \left(\frac{\mathrm{d}x^\sigma}{\mathrm{d}\lambda}\right) \\ & = \frac{\mathrm{d}x^\mu}{\mathrm{d}\lambda} \nabla_\mu \left(\frac{\mathrm{d}x^\sigma}{\mathrm{d}\lambda}\right) \\ & = \frac{\mathrm{d}x^\mu}{\mathrm{d}\lambda} \left(\partial_\mu \left(\frac{\mathrm{d}x^\sigma}{\mathrm{d}\lambda}\right) + \Gamma^\sigma_{\mu\nu} \frac{\mathrm{d}x^\nu}{\mathrm{d}\lambda}\right) \\ & = \frac{\mathrm{d}^2x^\sigma}{\mathrm{d}\lambda^2} + \Gamma^\sigma_{\mu\nu} \frac{\mathrm{d}x^\mu}{\mathrm{d}\lambda} \frac{\mathrm{d}x^\nu}{\mathrm{d}\lambda} \end{align}





After all this (possibly too much) algebra, the take-home message is that the connection arose from the difference between partial and covariant differentiation along a curve. It's not that we want our curve's tangent to never point in a different coordinate-dependent direction ($\ddot{x}^\lambda = 0$, which would be sufficient for "straight line" in Euclidean space). Rather we want the change in the coordinate-dependent direction we are pointing to be compensated by the fact that our tangent space is rotating with respect to our coordinates as we move along the curve.





1 Some people define the geodesic in a more global sense, as the path that extremizes the arc length between two points. These definitions agree if and only if the affine connection you are using is indeed the metric compatible, torsion free Christoffel connection.
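To see the machinery in action, one can integrate the geodesic equation numerically on a simple curved space. The following sketch uses the unit 2-sphere, whose nonzero Christoffel symbols are $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$; a geodesic launched along the equator stays on it (a great circle). The RK4 integrator and step sizes are arbitrary choices:

```python
import math

# Geodesic equation on the unit 2-sphere, coordinates (theta, phi):
#   theta'' = sin(theta)cos(theta) phi'^2        (from Gamma^theta_phiphi)
#   phi''   = -2 cot(theta) theta' phi'          (from Gamma^phi_thetaphi)

def deriv(state):
    theta, phi, dtheta, dphi = state
    return (dtheta,
            dphi,
            math.sin(theta) * math.cos(theta) * dphi**2,
            -2.0 / math.tan(theta) * dtheta * dphi)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shift(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(shift(state, k1, h / 2))
    k3 = deriv(shift(state, k2, h / 2))
    k4 = deriv(shift(state, k3, h))
    return tuple(s + h / 6 * (a + 2*b + 2*c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

# Start on the equator (theta = pi/2), moving purely in phi.
state = (math.pi / 2, 0.0, 0.0, 1.0)
for _ in range(1000):
    state = rk4_step(state, 0.01)

print(state[0])  # stays at pi/2: the equator is a great circle
```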


Monday 28 August 2017

particle physics - Quantum tunneling is faster than light travel?


Is quantum tunneling faster than light travel? My reasoning is that the particle cannot be detected inside the tunnel, so if it travels from A to B it must go from A to B instantly, hence faster than light?


This seems legit for the particle interpretation, and also for waves, IMHO. Some people mention the uncertainty principle, but I do not see how that explains it.



Answer



Faster-than-light tunneling appears only in non-relativistic quantum mechanics. As soon as you introduce the concept of relativity to QM, faster-than-light tunneling disappears.


Sunday 27 August 2017

conventions - Why have they chosen this direction for current in the RC circuit? Seems pretty artificial to me


You can ignore the Spanish, but you can imagine what's going on.


The thing is, they use this direction for current flow to derive the equation


$-iR-\frac{q}{C}=0$ and then derive the equation $q(t)=Q_0 e^{-t/RC}$ from the differential equation $\frac{dq}{dt}=-\frac{q}{RC}.$


This seems pretty artificial to me, because if I use the other direction for the current, which seems more natural to me, I don't get the same equation for $q(t)$. Any ideas?


Another thing that seems artificial is that their equation produces a "negative current", which is really just there to satisfy the $=0$ equality. And it would mean, wrongly, that the resistor is giving energy to the system rather than dissipating it to the exterior. (That's what I think; I may be wrong.)
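One way to convince yourself that the choice of current direction cannot change the physics: when you reverse the assumed direction, the relation between $i$ and $dq/dt$ flips sign as well, so the same decaying ODE $\frac{dq}{dt}=-\frac{q}{RC}$ comes out either way. A quick numerical sketch with made-up component values, cross-checking the exact solution against a crude Euler integration of that ODE:

```python
import math

# Discharging RC circuit: dq/dt = -q/(R*C), so q(t) = Q0 * exp(-t/(R*C)).
# Flipping the assumed current direction sends i -> -i AND flips the
# i vs dq/dt relation, so the same ODE (and the same decay) results.

R, C, Q0 = 1000.0, 1e-6, 1e-3   # ohms, farads, coulombs (made-up values)
tau = R * C                      # time constant, 1 ms here

def q_exact(t):
    return Q0 * math.exp(-t / tau)

# Crude Euler integration of dq/dt = -q/tau as a cross-check.
q, t, h = Q0, 0.0, tau / 1000
while t < 3 * tau:
    q += h * (-q / tau)
    t += h

print(q, q_exact(3 * tau))  # both roughly Q0 * e^-3
```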




air - A flying fly inside a sealed box on a scale


Suppose there is a scale able to measure weight with an uncertainty of $10^{-9}\,\mathrm{kg}$. On the scale sits an airtight plastic chamber. Initially, a fly of mass $10^{-5}\,\mathrm{kg}$ is sitting at the bottom of the chamber. At a later point in time the fly is flying around the chamber. Will there be a difference in the observed weight as measured by the scale when the fly is sitting at the bottom of the chamber compared to when it is flying around the chamber at some point in time? If so, what does the value of this difference depend on? (I am most concerned with the case where the fly has not touched any surface of the container in enough time for the scale to reach some equilibrium value. Or do the pressure variations induced by the fly's wings cause constant fluctuations in the scale?)



Answer



If you had a perfect scale, the reading would fluctuate based on


$$\delta w = m\ddot{x}_{cm}$$


$\delta w$ is the size of the fluctuation in the reading, $m$ the total mass on the scale (including fly and air), and $\ddot{x}_{cm}$ the acceleration of the center of mass.


Integrating over time,


$$\int \delta w(t)\,\mathrm{d}t = m\,\Delta(\dot{x}_{cm})$$


Here, $\Delta(\dot{x}_{cm})$ is the change in velocity of the center of mass over the period you observe the readings. Because the velocity of the center of mass cannot change very much, if you integrate the fluctuations over time, you will find that their average tends towards zero. If the fly begins and ends in the same place and the air is still, the fluctuations integrate out to exactly zero.


Whenever the fly is accelerating up, we expect the reading to be a little higher than normal. When the fly accelerates down, we expect the reading to be a little lower than normal. If the fly hovers in a steady state, the reading will be the same as if the fly were still sitting on the bottom.



A real scale cannot adjust itself perfectly and instantaneously, so we would need to know more details of the scale to say more about the real reading.


Dispersion of Probability Wave Packets


A picture in my text book shows a three dimensional wave packet dispersing, "resulting from the fact that the phase velocity of the individual waves making up the packet depends on the wavelength of the waves." Does this mean a particle moving through space has a gradually diminishing probability of being in its location? Also, why does the wavelength of the wave change the speed for a probability wave? I thought dispersion was characteristic of the medium, and I thought that for things like vacuum & light, air & sound, and also probability waves and space, they wouldn't disperse.




Answer



The dispersion of a wave is a result of the relationship between its frequency and its wavelength, which is appropriately known as the dispersion relation for the wave. For classical waves this depends on the medium: light, for example, will be dispersionless in vacuum and will have dispersion inside material media because the medium affects the dispersion relation. Quantum mechanical waves, on the other hand, have dispersion fundamentally built in.


Let's look at how light behaves. The dispersion relation for light is $$\omega=\frac c n k,$$ where $k=2\pi/\lambda$ is the wavenumber, $c$ is the speed of light in vacuum, and $n=\sqrt{\varepsilon_r\mu_r}$ is the medium's refractive index. In vacuum, $n\equiv1$ and there is no dispersion: the phase and group velocities, $\frac \omega k$ and $\frac{d\omega}{dk}$, are equal, constant, and independent of $k$, which are the mathematical conditions for dispersionless waves. In material media, though, $n$ will depend on the wavelength - it has to depend on the wavelength - and there will be dispersion.


Matter waves, on the other hand, are quite different. What are their frequency and wavelength, anyway? Well, the first is given by Planck's postulate that $E=h\nu$, and the second by de Broglie's relation $p=h/\lambda$; both should really be phrased as $$E=\hbar \omega\text{ and }p=\hbar k.$$ How are $\omega$ and $k$ related? The same way that $E$ and $p$ are: for nonrelativistic mechanics, as $E=\frac{p^2}{2m}$. Thus the dispersion relation for matter waves in free space reads $$\hbar\omega=\frac{\hbar^2k^2}{2m},\text{ or }\omega=\frac \hbar{2m}k^2.$$ Note how different this is to the one above! The phase velocity is now $v_\phi=\frac\omega k=\frac{\hbar k}{2m}$, and it is different for different wavelengths.


Now, why exactly does that imply dispersion?


Let's first look at the phase velocity, which is the velocity of the wavefronts, which are the planes that have constant phase. Since the phase goes as $e^{i(kx-\omega t)}$, the phase velocity is $\omega/k$. This, of course, is for a single plane wave, and doesn't apply to a general wavepacket for which phase is not as well defined, and for which the different wavefronts might be doing different speeds.


How then do we deal with wavepackets? The approach that works best with the formalism above is to think of a wavepacket $\psi(x,t)$ as a superposition of different plane waves $e^{i(kx-\omega t)}$, each with its own weight $\tilde\psi(k)$: $$\psi(x,0)=\int_{-\infty}^\infty \frac{dk}{\sqrt{2\pi}} \tilde \psi(k)e^{ikx}.\tag{1}$$ Now, if all the different plane-wave components $\tilde \psi(k)e^{ikx}$ moved at the same speed then their sum would just move at that speed and would not change shape.


(More mathematically: if $v_\phi=\omega/k$ is constant, then $$\psi(x,t)=\int_{-\infty}^\infty \frac{dk}{\sqrt{2\pi}} \tilde \psi(k)e^{i(kx-\omega t)} =\int_{-\infty}^\infty \frac{dk}{\sqrt{2\pi}} \tilde \psi(k)e^{ik(x-v_\phi t)} =\psi(x-v_\phi t,0),$$ so the functional form is preserved.)


For a matter wave in free space, however, the different phase speeds are not the same, and the different plane-wave components move at different speed. It is the interference of all these different components that makes them sum to $\psi(x,0)$ in equation (1), and if you mess with the relative phases you will get a different sum. Thus, with longer waves moving slower and shorter ones going faster, wavepackets with lots of detail encoded in long high-$k$ tails of their Fourier transform will change shape very fast.


In general it is hard to predict what the evolution of a wavepacket will do to it in detail. However, it is very clear that all wavepackets will (eventually) spread, since some components are going faster than others. Since the total probability is conserved, this must mean that the probability density will in general decrease.
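The spreading can be seen numerically by evolving a free Gaussian packet exactly in momentum space, where each component just picks up the phase $e^{-i\omega(k) t}$ with $\omega = \hbar k^2/2m$. This is an illustrative sketch in units where $\hbar = m = 1$; the grid size and evolution time are arbitrary choices:

```python
import numpy as np

# Free-particle Gaussian wavepacket evolved exactly in momentum space
# (hbar = m = 1). Each plane-wave component gains phase exp(-i k^2 t / 2);
# because the phase velocity k/2 depends on k, the packet spreads.

N, L = 2048, 200.0
x = np.linspace(-L/2, L/2, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=L/N)

sigma0 = 1.0
psi0 = np.exp(-x**2 / (4 * sigma0**2))
psi0 = psi0 / np.sqrt(np.sum(np.abs(psi0)**2) * (L/N))  # normalize

def width(psi):
    """Position-space standard deviation of |psi|^2."""
    p = np.abs(psi)**2 * (L/N)
    mean = np.sum(x * p)
    return np.sqrt(np.sum((x - mean)**2 * p))

def evolve(psi, t):
    """Exact free evolution via Fourier transform."""
    return np.fft.ifft(np.fft.fft(psi) * np.exp(-1j * k**2 * t / 2))

t = 5.0
sigma_t = width(evolve(psi0, t))
# Analytic prediction: sigma(t) = sigma0 * sqrt(1 + (t / (2 sigma0^2))^2)
print(width(psi0), sigma_t, sigma0 * np.sqrt(1 + (t / (2 * sigma0**2))**2))
```

The numerical width at $t=5$ matches the textbook spreading formula, and the probability density at the original location drops accordingly.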



If I put a particle with zero net momentum localized in some interval, then the probability of it remaining there will decrease. Note, though, that this is no surprise! The Uncertainty Principle demands that there be uncertainty in the particle's momentum. There is then some chance that the particle was moving to the left or to the right, so who's surprised to eventually find it out of the original interval?


temperature - Why does William Herschel's experiment show red light as warmer than blue?


Why does William Herschel's experiment show red light as warmer than blue if blue light is higher energy?



Here is an explanation of Herschel's experiment http://www.ipac.caltech.edu/outreach/Edu/Herschel/backyard.html


In short, Herschel placed multiple thermometers on the light separated by a prism. This led to the discovery of infrared light as he placed his control thermometer outside of the visible spectrum on the red side.


You may see similar apparatus to his here:




As you can see, the red thermometer is reading a higher temperature.


My guess is that blue is scattered more, therefore the thermometer for red absorbs more light.




Saturday 26 August 2017

spacetime - How does gravitational wave compress space time?



My question comes from talk of how gravitational waves stretch and compress spacetime.


Say there are two protons that are 1 centimeter apart; as a G-wave passes through them, would the electrostatic force experienced by the protons change?


What about Planck's constant? If two particles are x Planck lengths apart, is the new x smaller than the old x?



Answer



Suppose we choose our coordinates so both protons are on the $x$ axis, at $x = -0.5$cm and $x = +0.5$cm:


Protons


The distance $d$ between the protons is obviously 1cm - well, that may seem obvious but actually it's only true in flat spacetime. More generally the geometry of spacetime is described by a quantity called the metric tensor, $g_{\alpha\beta}$, and the proper distance between the two protons is given by:


$$ d = \int_\text{x=-0.5cm}^\text{x=0.5cm} \sqrt{g_\text{xx}}\,dx $$


In ordinary flat spacetime the value of $g_\text{xx}$ is constant at one, and the integral turns into:


$$ d = \int_\text{x=-0.5cm}^\text{x=0.5cm} dx = 1 \,\text{cm} $$



as we expect. Now suppose we have a gravitational wave coming out of the screen towards you. This causes an oscillating change in the spacetime geometry that looks like this (picture from Wikipedia):


Gravitational wave


What this is supposed to illustrate is that the value of $g_\text{xx}$ (and $g_\text{yy}$) oscillates with time, so it alternately becomes greater than one and less than one. In that case our integral becomes:


$$ d(t) = \int_\text{x=-0.5cm}^\text{x=0.5cm} \sqrt{g_\text{xx}(t)}\,dx $$


and our distance $d(t)$ is no longer a constant but oscillates above and below $d= 1\text{cm}$ as the gravitational wave passes through.


This isn't some mathematical trick; the gravitational wave really does cause the distance $d$ to change with time. If you shone a light ray between the protons and timed how long it took, you'd find that the time oscillated as well. Assuming the oscillation is slow compared to the light travel time, the time $T$ the light ray takes to travel between the protons would be:


$$ T(t) = \frac{d(t)}{c} $$


The electrostatic force between the protons would also change with time:


$$ F(t) = \frac{ke^2}{d^2(t)} $$


However Planck's constant is just a constant and this wouldn't be changed by the gravitational wave.
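To put rough numbers on the force modulation described above (all values here are illustrative assumptions, not taken from the answer): for a weak wave with strain amplitude $h_0$ we have $g_\text{xx} \approx 1 + h$, so $d \approx d_0\sqrt{1+h}$ and $F \propto 1/(1+h)$, giving a fractional force modulation of about $2h_0$:

```python
import math

# Illustrative numbers only: h0 here is hypothetical and hugely exaggerated
# (real astrophysical strains at Earth are of order 1e-21).
k = 8.9875517923e9      # Coulomb constant, N m^2 C^-2
e = 1.602176634e-19     # elementary charge, C
d0 = 1e-2               # unperturbed separation, 1 cm
h0 = 1e-3               # assumed strain amplitude of the wave

def separation(h):
    """Proper distance d = sqrt(g_xx) * d0, with g_xx = 1 + h for a weak wave."""
    return math.sqrt(1.0 + h) * d0

def coulomb_force(h):
    """F = k e^2 / d^2 evaluated at strain h."""
    return k * e**2 / separation(h)**2

F0 = coulomb_force(0.0)        # unperturbed force
F_max = coulomb_force(-h0)     # space "compressed": d smaller, F larger
F_min = coulomb_force(+h0)     # space "stretched":  d larger,  F smaller

# The fractional modulation (F_max - F_min)/F0 comes out to about 2*h0.
print((F_max - F_min) / F0)
```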



cosmology - squeezed radiation astronomy


Squeezed electromagnetic vacuum has a renormalized energy density smaller than that of the ordinary vacuum. So in my opinion that makes it an inconspicuous candidate for a dark-energy carrier.


Are there observatories at the moment attempting to detect squeezed radiation from astrophysical and cosmic background sources? If not, what sort of equipment would you need to measure squeezed-vacuum radiation?




reference frames - Deriving the law of reflection for a moving mirror in relativity


I am following a training course and came across this proof, from my colleague, that the ordinary law of reflection $\theta_i = \theta_r$ does not hold in relativity:



Let $S$ be a perfectly reflecting mirror.


Obviously if it is at rest the canonical law of reflection $\theta_i=\theta_r$ holds. Now suppose the mirror is moving with velocity $\overline v$ in the frame $\Sigma$, and let the frame of the mirror be $\Sigma'$. Then in $\Sigma'$,


$$\theta'_i\equiv\theta'_r$$


which means that


$$ \cos \theta =\frac{\cos \theta'_i+\beta}{1+\beta \cos \theta'_i}. $$


For the reflected ray changes $\theta$ before was $\theta_i $ is now $\theta =\pi-\theta_r $, similarly $\theta'= \pi- \theta'_r $ waves $ \cos \theta = - \cos (\theta_r)$ which is replaced provides:


$$ -\cos \theta_r =\frac{-\cos \theta'_r+\beta}{1-\beta \cos \theta'_r}, \quad \cos \theta_r =\frac{\cos \theta'_r-\beta}{1-\beta \cos \theta'_r} $$



$$\tan \left(\frac{\theta_i}{2}\right) =\sqrt{\frac{1-\beta}{1+\beta}}\tan \left(\frac{\theta'_i}{2}\right), \qquad \tan \left(\frac{\theta_r}{2}\right) =\sqrt{\frac{1+\beta}{1-\beta}}\tan \left(\frac{\theta'_r}{2}\right) $$


hence


$$ \tan\left( \frac{\theta_i}{2}\right)=\left(\frac{1-\beta}{1+\beta}\right)\tan \left( \frac{\theta_r}{2} \right) $$ and the law of reflection is no longer valid.



I don't think this derivation is very clear. Is there anyone who can help me to understand the steps in this unclear derivation, or recommend a book where I can find a proof of the final result?


If more information is needed, a picture of the derivation can be found here and here.



Answer



There are two inertial frames of reference involved; a "primed" frame in which the mirror is at rest, and an "unprimed" frame in which the mirror is moving in the direction opposite the mirror's surface normal. Label the coordinates such that the direction the mirror is moving in (i.e. the direction opposite the surface normal) is the $x$ and $x'$ direction, the other direction in the plane containing both the incident and reflected beams is the $y$ and $y'$ direction, and the direction that's unimportant to this problem is the $z$ and $z'$ direction.


In either frame, the components of an incident photon's three-velocity are given by the projections of the three-velocity onto the coordinate axes. I.e., the three-velocity of an incident photon, as measured in each of the two frames of reference, is


$$\mathbf{u}_i=\left(\begin{array}{c}c\cos\theta_i \\ c\sin\theta_i \\ 0\end{array}\right)$$



and


$$\mathbf{u}'_i=\left(\begin{array}{c}c\cos\theta'_i \\ c\sin\theta'_i \\ 0\end{array}\right)\ \ ,$$


where $\theta_i$ and $\theta'_i$ are the angle of incidence as measured in each of the two frames of reference, and $c$ of course is the speed of light.


The derivation in the question uses a version of the relativistic velocity addition formula, a proof of which can be found in the linked-to Wikipedia article. That formula states that if a primed frame is moving with speed $v$ in the $x$ direction as measured in an unprimed frame, speeds along the $x$ and $x'$ direction, as measured in each of the two frames, are related as


$$u_x=\frac{u'_x+v}{1+v u'_x/c^2}\ \ .$$


Plugging in the $x$ and $x'$ components of $\mathbf{u}_i$ and $\mathbf{u}'_i$ into the relativistic velocity addition equation, i.e. setting


$$u_x=c \cos\theta_i$$


and


$$u'_x = c \cos\theta'_i$$


gives



$$c \cos\theta_i=\frac{c \cos\theta'_i+v}{1+v c \cos\theta'_i/c^2}\ \ .$$


Dividing both sides of that equation by $c$ gives


$$\cos\theta_i=\frac{\cos\theta'_i+\beta}{1+\beta \cos\theta'_i}\ \ ,$$


where $\beta = v/c$.


We can perform a similar procedure with a reflected photon. The three-velocity of a reflected photon, as measured in each of the two frames of reference, is


$$\mathbf{u}_r=\left(\begin{array}{c}-c\cos\theta_r \\ c\sin\theta_r \\ 0\end{array}\right)$$


and


$$\mathbf{u}'_r=\left(\begin{array}{c}-c\cos\theta'_r \\ c\sin\theta'_r \\ 0\end{array}\right)\ \ ,$$


where $\theta_r$ and $\theta'_r$ are the angle of reflection as measured in each of the two frames of reference. In the case of the reflected photon, the values we plug in to the relativistic velocity addition formula are


$$u_x=-c \cos\theta_r$$



and


$$u'_x = -c \cos\theta'_r\ \ ,$$


giving


$$- c \cos\theta_r=\frac{-c \cos\theta'_r+v}{1+v (-c \cos\theta'_r)/c^2}\ \ .$$


Dividing both sides of that equation by $-c$ gives


$$\cos\theta_r=\frac{\cos\theta'_r-\beta}{1-\beta \cos\theta'_r}\ \ .$$


The above formulas for $\cos\theta_i$ and $\cos\theta_r$ are inconvenient for the purposes of comparing $\theta_i$ and $\theta_r$, largely due to $\theta'_i$ and $\theta'_r$ each appearing twice in the equations. We can arrive at simpler equations by using the tangent half-angle formula


$$\tan\left(\frac{\theta}{2}\right)=\sqrt{\frac{1-\cos\theta}{1+\cos\theta}}\ \ .$$


Applying the tangent half-angle formula to $\theta_i$ gives


$$ \begin{align} \tan\left(\frac{\theta_i}{2}\right)&=\sqrt{\frac{1-\cos\theta_i}{1+\cos\theta_i}}\\ &=\sqrt{\frac{1-\frac{\cos\theta'_i+\beta}{1+\beta \cos\theta'_i}}{1+\frac{\cos\theta'_i+\beta}{1+\beta \cos\theta'_i}}}\\ &=\sqrt{\frac{1+\beta \cos\theta'_i-(\cos\theta'_i+\beta)}{1+\beta \cos\theta'_i+\cos\theta'_i+\beta}}\\ &=\sqrt{\frac{(1-\beta)(1-\cos\theta'_i)}{(1+\beta)(1+\cos\theta'_i)}}\\ &=\sqrt{\frac{1-\beta}{1+\beta}}\sqrt{\frac{1-\cos\theta'_i}{1+\cos\theta'_i}}\\ &=\sqrt{\frac{1-\beta}{1+\beta}}\tan\left(\frac{\theta'_i}{2}\right)\ \ . \end{align} $$



Applying the tangent half-angle formula to $\theta_r$ would proceed similarly, except that $\theta'_i$ is everywhere replaced by $\theta'_r$, and $\beta$ is everywhere replaced by $-\beta$. The result is thus


$$\tan\left(\frac{\theta_r}{2}\right)=\sqrt{\frac{1+\beta}{1-\beta}}\tan\left(\frac{\theta'_r}{2}\right)$$


or


$$\tan\left(\frac{\theta'_r}{2}\right)=\sqrt{\frac{1-\beta}{1+\beta}}\tan\left(\frac{\theta_r}{2}\right)\ \ .$$


But since the mirror isn't moving in the primed frame, the normal law of reflection is valid in the primed frame, $\theta'_i=\theta'_r$, and so we have


$$ \begin{align} \tan\left(\frac{\theta_i}{2}\right)&=\sqrt{\frac{1-\beta}{1+\beta}}\tan\left(\frac{\theta'_i}{2}\right)\\ &=\sqrt{\frac{1-\beta}{1+\beta}}\tan\left(\frac{\theta'_r}{2}\right)\\ &=\sqrt{\frac{1-\beta}{1+\beta}}\left(\sqrt{\frac{1-\beta}{1+\beta}}\tan\left(\frac{\theta_r}{2}\right)\right)\\ &=\left(\frac{1-\beta}{1+\beta}\right)\tan\left(\frac{\theta_r}{2}\right)\ \ , \end{align} $$


which was the result to be shown.


Note that for $0<\beta<1$,


$$0<\left(\frac{1-\beta}{1+\beta}\right)< 1\ \ ,$$


so



$$\tan\left(\frac{\theta_i}{2}\right)<\tan\left(\frac{\theta_r}{2}\right)$$


and $\theta_i<\theta_r$.
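The chain of formulas above is easy to verify numerically. The following sketch (with arbitrarily chosen values of $\beta$ and $\theta'_i = \theta'_r$) applies the two aberration formulas and confirms both the half-angle relation and $\theta_i < \theta_r$:

```python
import math

beta = 0.4                    # arbitrary mirror speed v/c
theta_prime = 1.0             # arbitrary common angle theta'_i = theta'_r (radians)

# Aberration of the incident ray: cos(theta_i) = (cos(t') + b) / (1 + b cos(t'))
cos_i = (math.cos(theta_prime) + beta) / (1 + beta * math.cos(theta_prime))
# Aberration of the reflected ray: cos(theta_r) = (cos(t') - b) / (1 - b cos(t'))
cos_r = (math.cos(theta_prime) - beta) / (1 - beta * math.cos(theta_prime))

theta_i = math.acos(cos_i)
theta_r = math.acos(cos_r)

# Half-angle relation: tan(theta_i/2) = ((1-b)/(1+b)) * tan(theta_r/2)
lhs = math.tan(theta_i / 2)
rhs = ((1 - beta) / (1 + beta)) * math.tan(theta_r / 2)
print(lhs, rhs, theta_i < theta_r)
```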


ADDENDUM:


As a response to the comments on this answer, the following provides additional clarification of the confusing statement in the question that "For the reflected ray changes $\theta$ before was $\theta_i $ is now $\theta =\pi-\theta_r $, similarly $\theta'= \pi- \theta'_r $ waves $ \cos \theta = - \cos (\theta_r)$":


For a three-velocity $\mathbf{u}$ in general, the $x$ component of $\mathbf{u}$, $u_x$, is given by the scalar projection of $\mathbf{u}$ onto $\hat{\mathbf{x}}$, the unit vector in the $x$ direction. As per the scalar projection Wikipedia article, the scalar projection can be expressed as


$$u_x=\left|\mathbf{u}\right|\cos\theta\ \ ,$$


where $\left|\mathbf{u}\right|$ is the length of $\mathbf{u}$, and $\theta$ is the angle between $\mathbf{u}$ and $\hat{\mathbf{x}}$.


The speed of either an incident or a reflected photon is $c$, i.e.,


$$\left|\mathbf{u}_i\right|=\left|\mathbf{u}_r\right|=c\ \ ,$$


so in either case we use $\left|\mathbf{u}\right|=c$ in the scalar projection equation for $u_x$.



The angle of incidence $\theta_i$ is defined as the angle between $-\mathbf{u}_i$ and $\hat{\mathbf{n}}$, where $\hat{\mathbf{n}}$ is the mirror surface's unit normal. But that angle is the same as the angle between $\mathbf{u}_i$ and $-\hat{\mathbf{n}}$, which is the same as the angle between $\mathbf{u}_i$ and $\hat{\mathbf{x}}$, since we've defined our coordinate system such that $\hat{\mathbf{x}}=-\hat{\mathbf{n}}$. Thus, when dealing with $\mathbf{u}_i$, we just use $\theta=\theta_i$ in the scalar projection equation for $u_x$, i.e.


$$u_x=\left|\mathbf{u}\right|\cos\theta=c\cos \theta_i\ \ ,$$


as in the original answer above.


On the other hand, the angle of reflection $\theta_r$ is defined as the angle between $\mathbf{u}_r$ and $\hat{\mathbf{n}}$. The angle between $\mathbf{u}_r$ and $\hat{\mathbf{x}}$ differs from $\theta_r$ by $\pi$ radians, because the angle between $\hat{\mathbf{n}}$ and $\hat{\mathbf{x}}$ is $\pi$ radians. Thus, when dealing with a reflected photon, we use $\theta=\pi-\theta_r$ instead of $\theta=\theta_i$ in the scalar projection equation for $u_x$,


$$u_x=\left|\mathbf{u}\right|\cos\theta=c\cos (\pi-\theta_r)=-c\cos\theta_r\ \ ,$$


as in the original answer.


The scalar projection equations for $u'_x$ are the same as the scalar projection equations for $u_x$, for the same reasons, except that they use $\theta'_i$ and $\theta'_r$ instead of $\theta_i$ and $\theta_r$.


Thursday 24 August 2017

cosmology - Did time exist before the Big Bang and the creation of the universe?




Does time stretch all the way back for infinity or was there a point when time appears to start in the universe?


I remember reading long ago somewhere that according to one theory time began shortly before the creation of the universe.


Does time have a starting point of note?



Answer



In short, we don't know. There are a few indications that time started at the big bang, or at least it had some form of discontinuity. This might be wrong though.




  • According to General Relativity, there is no such thing as absolute time. Time is always relative to an observer; without the universe there would be no corresponding concept of time. All observers within the universe would see their clocks "slowed down" the nearer they are (in time) to the big bang, and at the big bang itself their clocks would stop. That said, we know that GR doesn't apply as-is all the way back to the Big Bang.





  • Some cosmological theories like CCC predict a series of aeons and some form of cyclic universe. These predict a discontinuity (CCC predicts a conformal scale change) of time at the big bang, and at the end of the universe.






As a side note: people tend to have a special fascination with time. For all we know though, time is only relatively special. From a cosmological point of view the discussion is whether space-time existed. We are pretty sure that it was very very small at some point.


waves - An analogy for resonance?


I learned that the phenomena of resonance occurs when the frequency of the applied force is equal to the natural frequency of an object. At this point, an object vibrates with maximum amplitude.




  1. How can a force have a frequency? My teacher used the analogy of pushing a child on a swing. The person who pushes should apply a force of the same frequency of the swing (I didn't really get that).





  2. Following the analogy, what if the person pushes harder? Wouldn't the "amplitude" of the swing be larger? I mean, is there any such thing as a "maximum amplitude" for a vibrating object, given that I can always increase the magnitude of the force?




  3. Do all objects always vibrate with some kind of frequency? I mean if I put a speaker in front of a goblet and produce a sound having a frequency equal to the "natural frequency", it will just start vibrating with maximum amplitude?




I know I asked a lot of questions at once, but I am really confused about this phenomenon and just cannot get the intuition behind it.




fluid dynamics - Why does a glass bottle completely filled with water not break when hit on a nail?



Here is a video of Julius Sumner Miller's show illustrating Pascal's principle: https://www.youtube.com/watch?v=8ma4kW3xVT0


At 12:31, he states (but only sort of demonstrates) that a glass bottle completely filled with water would be able to act as a hammer, meaning he could hit the screw and the bottle would not break.


My confusion is that, according to Pascal's principle, hitting a glass bottle with a nail like Julius did would essentially be equivalent to hitting the bottle with the same nail everywhere from the inside. That would certainly shatter the bottle.


I'll provide my explanation of this (I'm probably wrong): The internal pressure of water at ~20 degrees Celsius is much less than atmospheric pressure. This means that a closed, glass bottle filled completely with water has a net force going inwards. Water can not compress, and so the pressure the water exerts on the glass bottle increases until it is not enough to deform the glass bottle. Thus there is a net force inwards on the glass bottle system which is the very lowest amount at which glass will deform.


When the glass bottle is hit by a nail, it deforms, applies a force to the water over some area, and by pascal's principle, this same pressure is applied everywhere throughout the glass bottle. However, because there is already net force inwards from the atmosphere and the water's pressure, the glass bottle does not feel a sufficient net force outwards to shatter it. Essentially, the atmosphere cushions the blow outwards that the glass bottle would otherwise feel from the water's application of outwards pressure.


Is this correct, or grossly misguided?


Edit: Another explanation I can think of is based on how the glass deforms. If, for example, the glass deformed by having the half of the surface of its cylinder deforming inward, then this would imply a relatively large force over a relatively large area, which diminishes the pressure exerted by the water outwards greatly.



Answer



Glass breaks because it is brittle instead of flexible; this means that if the shape of the glass deforms enough, if a surface bends just a little, it breaks. If the shape of the bottle doesn't change, then it won't break, no matter what forces are applied to it.


In the case of a bottle that is full of water with no air, the force of the impact with the nail causes one side of the bottle to deform. But, water is incompressible, so the water stops the side of the bottle from bending more than a negligible amount in order to keep the water volume constant (this is what "incompressible" means). Now, pressure is force divided by area, so the force driving the nail is spread out over the entire interior of the bottle by the water, so no part of the glass bends enough to break. That's why the bottle would survive.



In the case where the bottle has a bubble in it, the story would be different. Air and all other gases are very compressible. So, upon impact with the nail, the side of the bottle impacting the nail would deform. The gas would compress from the transmitted water pressure, allowing the water to move out of the way of the deforming side of the glass bottle into the volume formerly occupied by the bubble. This allows the glass to bend more, resulting in it breaking.


To compare, imagine the result of the virtual experiment with the corked bottle full of water with no bubbles at 11:44. If the professor had hit the top of the cork, the bottle would have shattered. Why? In order to stop the cork from entering the bottle (to keep the volume of water from compressing), the water would have to deliver a large force to the cork to stop it. This requires a large pressure since the area of the cork face is small ($Force = Pressure \times Area$). This pressure is transmitted to all sides of the bottle, generating an enormous force since the bottle interior has a much larger area. Air pressure outside the bottle is far too weak to prevent the bottle's sides from bowing outward and breaking. This is the basis behind the hydraulic press seen from 4:37 to 7:20.
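To put rough numbers on that last point (the geometry here is entirely hypothetical, chosen only to illustrate the area ratio): suppose the cork face has an area of 5 cm² and the bottle's interior walls 500 cm². A 100 N blow on the cork then produces a pressure that, transmitted undiminished by Pascal's principle, pushes outward on the walls with a hundred times the original force:

```python
# Hypothetical geometry, purely for illustration.
A_cork = 5e-4       # cork face area, m^2 (5 cm^2)
A_bottle = 5e-2     # interior wall area, m^2 (500 cm^2)
F_blow = 100.0      # force of the blow on the cork, N

# Pascal's principle: the pressure is the same everywhere in the water.
pressure = F_blow / A_cork          # Pa (~200 kPa)
F_on_walls = pressure * A_bottle    # total outward force on the walls, N (~10 kN)

print(pressure, F_on_walls)
```

The force multiplication is just the area ratio, which is exactly the mechanism of the hydraulic press shown in the video.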


general relativity - Avoiding Pseudo-tensors when addressing global conservation of energy in GR


Discussions about global conservation of energy in GR often invoke the use of the stress-energy-momentum pseudo-tensor to offer up a sort of generalization of the concept of energy defined in a way that it will be conserved, but this is somewhat controversial to say the least. Rather than dwell on the pseudo-tensor formalism, I was wondering if anyone had any thoughts on the following approach.


Prior to Noether's theorem, conservation of energy wasn't just a truism, despite the fact that new forms of energy were often invented to keep it in place. This is because there was still a condition the laws of nature had to satisfy, and in many ways this condition is a precursor to Noether's theorem. The condition is simply that it had to be possible to define this new energy such that no time evolution of the system would bring the system to a state where all other energy levels were the same, but this new energy had changed, or conversely, a state where this new energy remained the same, but the sum of the remaining energy had changed. For example, if I wanted to define gravitational potential energy as $GMm/r$, the laws of gravity could not allow a situation where, after some time evolution, a particle in orbit around a planet had returned to its exact position (thus maintaining the same potential) but its kinetic energy had somehow increased. This is of course just saying that the physics must be such that a conserved scalar quantity can exist, which we later came to understand as requiring time-invariant laws of physics.


The reason I formulated this in a convoluted manner instead is the following. In GR we invoke pseudo-tensors because Noether's theorem tells us energy should not be conserved given that the metric is not static, but we try to recover some semblance of energy conservation by asking if the dynamics of that metric do not somehow encode a quantity somehow resembling energy. Couldn't we ask instead if some sort of equivalent to energy was conserved in GR by asking :



In GR, do there exist initial conditions $\rho(t_0,x,y,z)$ and $g_{\mu \nu}(t_0,x,y,z)$ such that after some time we recover the same $\rho(t,x,y,z) = \rho(t_0,x,y,z)$ but with a different $g_{\mu \nu}$, or conversely, we recover the same $g_{\mu \nu}$ but with different $\rho$ ?




Obviously here we only ask about $\rho$ and not the entire $T_{\mu \nu}$ tensor. For starters, setting $T_{\mu \nu}$ immediately determines $g_{\mu \nu}$ so what I'm asking would be impossible. This isn't a problem because we're looking to generalize the concept of a conserved scalar, so it makes more sense to ask about the scalar field $\rho(t,x,y,z)$ than it does to ask about the tensor field $T_{\mu \nu}$. What this means is that you'd have to play around with how the momenta change in order to find a situation in which the answer is yes.


For the Friedmann equations the answer is obviously no, so some semblance of energy conservation still exists: $\rho \propto a^{-3(1+w)}$, so the densities return to the same values if the universe ever collapses in on itself and returns to previous values of $a$ (though I'm not sure this necessarily holds for a time-varying $w$). Can this be shown to hold generally in General Relativity?
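To make the Friedmann check concrete, a trivial sketch with made-up numbers: for constant $w$, $\rho(a) = \rho_0 (a/a_0)^{-3(1+w)}$ depends only on $a$, so whenever the scale factor revisits a previous value, $\rho$ does too:

```python
def rho(a, rho0=1.0, a0=1.0, w=0.0):
    """Energy density for a constant equation-of-state parameter w (dust, w=0, by default)."""
    return rho0 * (a / a0) ** (-3 * (1 + w))

# Expansion followed by recollapse through the same scale-factor values:
history = [1.0, 1.5, 2.0, 1.5, 1.0]
densities = [rho(a) for a in history]
print(densities[0] == densities[-1])
```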




particle physics - Origin of electric charge


Baryons have charges that are the sum of the fractional charges of their building blocks, the quarks. But what gives these quarks electric charges? What interactions do they have with photons? And what about leptons, like muons and electrons? How do they get their -1 charge? And then, how do positrons and anti-muons get +1 charge? Same goes for quarks: how do up-antiquarks get -2/3 charge? Or I may want to formulate my question this way:


How does any elementary particle interact with the EM-Force to get an electric charge?




heisenberg uncertainty principle - Can Zeno's Dichotomy Paradox be Resolved with Quantum Mechanics?



I would like to start off by saying this is not a philosophical question. I have a specific question pertaining to physics after the following explanation and background information, which I felt was necessary to properly formulate my question.


I have done some research on Zeno's Paradoxes as well as some of their modern day mathematical and physical "solutions." I understand that Zeno's dichotomy paradox can be explained mathematically by constructing an argument that makes use of the following infinite series: $$ \sum_{n=1}^\infty {1 \over 2^n} = 1$$ Because this series converges to $1$, then that implies that if physically traveling from some point A to another point B truly does require an infinite amount of actions in which an object travels ${1 \over 2}$ of the distance, then ${1 \over 4}$ then ${1 \over 8}$ then ${1 \over 16}$ and so on, the sum of those actions would eventually result in traveling the full distance. Mathematically, this makes perfect sense. However, knowing that the mathematics we use to describe the physical world is only a model, adopted when there is sufficient evidence to support it, could it be that the solution gained by the above argument comes to the correct conclusion using the wrong path? Using the infinite series provides a solution to the dichotomy paradox if it does indeed require an infinite amount of actions to move from point A to point B. So, then the question becomes, does it truly take an infinite amount of actions to physically move from one point to another (in which case, I believe the mathematical solution is a perfect model), or are there other phenomena at play that cannot be described in this fashion?
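The convergence invoked above is easy to see numerically: the partial sums of $\sum_{n=1}^{\infty} 1/2^n$ approach 1, and after $N$ terms the shortfall is exactly $1/2^N$. A minimal check:

```python
# Partial sums of sum_{n=1}^{N} 1/2^n; the remainder after N terms is 1/2^N.
def partial_sum(N):
    return sum(1.0 / 2**n for n in range(1, N + 1))

# The shortfall from 1 shrinks geometrically with the number of "actions" taken.
remainders = [1.0 - partial_sum(N) for N in (1, 2, 10, 50)]
print(remainders)
```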


I have one idea pertaining to a quantum mechanical description of the situation that does not disprove, but provides an alternative to the reasoning that it requires an infinite number of actions to move from point A to point B.



  • Suppose we have a box of some macroscopic size (say $1m^3$), and it is in the process of moving with a constant velocity from some point A to another point B. Obviously, the box is made up of atoms, and if we zoom in far enough on the leading face of the box we could see that the motion of said box is really the motion of a very large number of atoms moving in the same direction. According to the uncertainty principle, it is impossible to determine both a particle's momentum and position simultaneously. According to the physics professor who first taught me about this, the reason for this uncertainty does not lie within our measurement methods, but rather, it is an impossibility inherent in the fact that all particles have wavelike characteristics. Thus, their position and momentum are not even clearly defined at a specific moment in time. Could it not be possible then, that if we zoomed in far enough on the leading face of our box, that we would find the face of the box does not have a definite position? If this is true, then wouldn't there be a certain point very close to the destination point, B, where the idea of cutting the remainder of the distance to the destination in half makes no difference to the particles in question? We may say that to complete the trip, we must travel another ${1 \over 2^x}$ portion of the distance we started with, but if the uncertainty in position is larger than this remaining distance, could the particle (and by the same argument, all of the particles that make up the leading face of the box), traverse the final distance to point B without physically having to travel the "infinite" amount of points remaining?


Also, I have a second idea that I want to present as an aside, something that I don't believe can be proven as of now, but I wonder if, according to current understanding, it is possible.



  • What if, instead of analyzing the situation at hand through the distance traveled, it was analyzed by analyzing the passage of time. Specifically, if time is quantized, would it resolve the dichotomy paradox in a physical sense? I realize that whether or not time is quantized is up for debate (there is even this question here at stack exchange that explores the idea), so I am not going to ask if time is quantized. Based off of my understanding, however, there is nothing known as of now that says time cannot be quantized (please correct me if I am wrong). Therefore, if time were quantized then could we not say that for a given constant velocity there is a minimum distance that can be traveled? A distance that corresponds to $\Delta t_{min} * v$? Would this not imply that as our moving box reached a certain distance, $d$ away from point B, and if $d < \Delta t_{min} * v$, the box would physically not be able to travel such a small distance, and in the next moment that the box moved, it would effectively be across the destination point B?



All of that being said, here is my specific question: In my above arguments, is any of my logic faulty? Is there any known law that would disprove either of these explanations for Zeno's dichotomy paradox? If so, is there a better way of physically (not mathematically) resolving the paradox?



Answer



There are a lot of comments. The end result is that your first proposition is correct as far as the physics models we have validated with experimental evidence go.


Even classically, when one has an extended body whose trajectory is described by its center of mass, the paradox becomes trivial: even if the center of mass never reaches the goal, half of the solid body (minus an infinitesimal) essentially does.


The Heisenberg uncertainty principle is involved at the elementary-particle level, and yes, classical motion is no longer relevant and does not describe the behavior of nature at microscopic scales.


As for your second proposition, what disallows discreteness of space and time is not lack of imagination. It is, at the moment, the incompatibility of known and validated physics with the mathematical models proposed to incorporate such discreteness. As far as I know, locality and Lorentz invariance, which have been validated an innumerable number of times, are the main obstacles for such theories; I am also not aware of any proposals that can incorporate the standard model of particle physics, which is an encapsulation of all the experimental measurements we have up to now. Thus discreteness in time exists in some models but is not validated by any data and is not a proposition accepted by the mainstream physics community. In any case, even if you take time as the variable, the argument from the Heisenberg uncertainty principle is sufficient, since it also comes in the form $\Delta E \, \Delta t \gtrsim \hbar$.


electromagnetism - Is there one all encompassing electromagnetic field? Or are electromagnetic fields separate and individually generated?


Many people have been using very confusing and sometimes contradictory language when describing electromagnetic fields and electromagnetic radiation. It's going to be hard to word this question so everyone understands so I'm going to use some examples.


I want to know if there is simply one large electromagnetic field that is affected by magnets, electricity, and electromagnetic radiation. If not, does that mean that each magnet, each electric flow, and each photon of electromagnetic radiation creates a separate electromagnetic field, all of which overlap but do not interfere with each other?



My understanding is that if electromagnetic fields or waves can interfere with each other, then they are really all just manipulations of one large electromagnetic field that fills all of space much like the Higgs field or spacetime does. In that understanding, would photons (electromagnetic waves) simply be waves in that field?




Here is an experiment example. Say I did a variation of the double-slit experiment. In this variation, instead of having one source that emits electromagnetic radiation from behind the two slits, there are two sources, one at each slit. In essence, I'm performing two single-slit experiments right next to each other projecting onto the same surface. Instead of firing only one photon at a time, each source fires a single photon at a time, both sources firing at the same time.


I have two predictions for what would happen. One for if there is one large electromagnetic field that fills all of space, and one for if each electromagnetic field is separate.


The top image is my prediction of what the result would look like if each field is separate, and by implication the waves do not interfere. The bottom image is my prediction if there is a single electromagnetic field and the waves in the field can interfere with each other.


Predictions




I guess a good change in vocabulary could help. I imagine an "electromagnetic continuum" much like the spacetime continuum. In the spacetime continuum, a gravitational field would simply be a bending/warping of the spacetime continuum. In that sense, an electromagnetic field would be a "bending" or "warping" of the "electromagnetic continuum", and oscillations of the "electromagnetic continuum" would be electromagnetic waves/radiation.


A good way of phrasing my question with that use of vocabulary would be: "Does an electromagnetic continuum exist, or is it all just separate electromagnetic fields and oscillations?"





I hope these examples give enough information for you to understand my question(s) and correct any misconceptions I may have.


Thanks!



Answer



Typically we describe the electromagnetic field using the electromagnetic four-potential, which is represented as $\mathbf{A}$. From the four potential we derive the electric and magnetic fields using:


$$ \mathbf{E} = -\nabla \phi - \frac{\partial\mathbf{A}}{\partial t} $$


$$ \mathbf{B} = \nabla\times \mathbf{A} $$


where $\phi$ is the electric potential. It is very important to appreciate that the values we get for $\mathbf{E}$ and $\mathbf{B}$ are coordinate dependent, by which I mean that observers in different frames will measure different values for $\mathbf{E}$ and $\mathbf{B}$. You can see this very easily. Suppose in my frame I have a charge at rest, in which case that charge generates a static electric field and no magnetic field. If you are moving with respect to me then you observe a moving charge, i.e. a current, and currents generate magnetic fields.


So if you are considering an all encompassing field this would have to be the four potential. But the four-potential is not a physical object. It is a function of spacetime. If you feed a spacetime point into the function it will return a vector that describes the electric and magnetic potentials at that point.


In principle the four-potential is a function of the position and velocity of every charge present anywhere in the universe. In practice we usually find we can ignore distant charges so the four potential is a function of only finitely many charges:


$$ \mathbf{A} = f_A(q_0, v_0, q_1, v_1, ... q_n, v_n) $$



But the function $f_A$ can be broken up into a sum of separate functions for each charge:


$$ \mathbf{A} = f_0(q_0, v_0) + f_1(q_1, v_1) + ... + f_n(q_n, v_n) $$


and we could write this as:


$$ \mathbf{A} = \mathbf{A}_0 + \mathbf{A}_1 + ... + \mathbf{A}_n $$


where you can regard $\mathbf{A}_0$ as the four-potential of charge 0 and so on.
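As a concrete illustration of this decomposition, here is a minimal sketch for the simplest piece of the four-potential, the static scalar potential $\phi$, where the superposition is just a sum of single-charge Coulomb terms. The charges and positions below are made-up values, not from the question:

```python
import math

K = 8.9875517923e9  # Coulomb constant, N*m^2/C^2

def phi_single(q, source, point):
    """Electrostatic potential at `point` due to one static charge q at `source`."""
    r = math.dist(source, point)
    return K * q / r

def phi_total(charges, point):
    """Total potential: the sum of the single-charge potentials."""
    return sum(phi_single(q, pos, point) for q, pos in charges)

# Two hypothetical charges (coulombs) at positions in metres:
charges = [(1e-9, (0.0, 0.0, 0.0)), (-2e-9, (1.0, 0.0, 0.0))]
p = (0.5, 0.5, 0.0)

total = phi_total(charges, p)
parts = [phi_single(q, pos, p) for q, pos in charges]
# The single value at p is exactly the sum of the per-charge contributions:
assert math.isclose(total, sum(parts))
```

The same bookkeeping applies component by component to the full four-potential: one value at each point, decomposable into per-charge terms.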


Now back to your question:


At every point in spacetime there is just one value for $\mathbf{A}$. The four potential cannot simultaneously have more than one value. However we can write that value as a sum of the four-potential of every charge.


So there is a sense in which there is a single electromagnetic four-potential, but there is also a sense in which there are lots of individual electromagnetic four-potentials that add up to give a single value at every point.


I think which of these interpretations you prefer is philosophy rather than physics. My view is that there is a single four-potential and it can be mathematically decomposed into individual components.


Wednesday 23 August 2017

particle physics - Nuclear Instability and Q value in Alpha decay


Alpha decay: $(Z,A) \rightarrow (Z-2,A-4) + {}^4_2\mathrm{He}$


According to the book: "Nuclear and particle physics" by Williams, $Q_\alpha$ is the measure of available energy to permit an alpha decay.


It is defined as:



$$ Q_\alpha = M(Z,A)-M(Z-2,A-4)-M(2,4)$$


(in natural units)


where M is the nuclear mass, defined as: $M(Z,A) = Zm_p+(A-Z)m_n-BE(Z,A) $


where BE is the binding energy.


Since the number of protons and neutrons doesn't change (in this case), $Q_\alpha$ can be written as:


$$ Q_\alpha = BE_{f}-BE_{i}=\Delta BE$$


If $Q_\alpha >0$ then the reaction is possible.


I am having difficulty interpreting the $Q_\alpha$-value in alpha decays. I will explain my reasoning, hoping that someone can point out my mistake.


My reasoning:


If $Q_\alpha = \Delta BE > 0$ then it means that the energy of the system increased, and therefore energy from somewhere would be needed to make the reaction possible. Then $Q_\alpha $ wouldn't be the energy available for the reaction to happen, but the extra energy needed for the reaction to happen.




Answer




..... it means that the energy of the system increased



is not a correct statement as in a decay the (potential) energy of the system decreases.
You can think of it as a reduction in the potential energy of the system resulting in an increase in the kinetic energy of the system.


When the constituent parts of the parent nucleus $(A,Z)$ come together a certain amount of energy is released $Q_{\rm parent}$ and so the parent nucleus is at a lower energy state than its constituent parts.


When the constituent parts of the daughter nucleus $(A-4,Z-2)$ come together a certain amount of energy is released $Q_{\rm daughter}$ and so the daughter nucleus is at a lower energy state than its constituent parts.


When the constituent parts of the alpha particle $(4,2)$ come together a certain amount of energy is released $Q_{\rm alpha}$ and so the alpha particle is at a lower energy state than its constituent parts.


Now $Q_{\rm alpha}+ Q_{\rm daughter} > Q_{\rm parent}$ ie the alpha particle and daughter nucleus together are in a lower energy state than the parent nucleus and the Q-value of the decay is given by



$$Q_{\rm decay}=Q_{\rm alpha}+ Q_{\rm daughter} - Q_{\rm parent} $$


which is a positive quantity, ie energy is released as a result of the decay and manifests itself as the kinetic energy of the alpha particle and the kinetic energy of the daughter nucleus.


Put another way, it takes more energy to break up a daughter nucleus and an alpha particle into their constituent parts than the energy required to break up the parent nucleus into its constituent parts and the difference in energy is the Q-value of the decay.
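As a numeric illustration of a positive Q-value, the classic case $^{238}\mathrm{U} \rightarrow {}^{234}\mathrm{Th} + \alpha$ can be checked with standard tabulated atomic masses (electron masses cancel in the difference):

```python
# Standard tabulated atomic masses in unified atomic mass units (u):
U238, TH234, HE4 = 238.050788, 234.043601, 4.002602
U_TO_MEV = 931.494  # energy equivalent of 1 u, in MeV

# Q-value: mass of parent minus masses of the products
Q = (U238 - TH234 - HE4) * U_TO_MEV
print(round(Q, 2))  # about 4.27 MeV: positive, so the decay is energetically allowed
```

The released ~4.27 MeV appears as kinetic energy shared between the alpha particle and the recoiling daughter nucleus.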


quantum electrodynamics - How many photons can an electron absorb and why?




  1. How many photons can an electron absorb and why?





  2. Can all fundamental particles that can absorb photons absorb the same amount of photons and why?




  3. If we increase the velocity of a fundamental particle, do we then increase the amount of photons it can emit?




  4. At constant temperature and mass of an isolated fundamental particle that does not and will not move (constant speed and 0 vibration), is emitting a photon the only way to lose energy?






Answer



It would be useful if your profile gave an indication of educational background in physics or even of age.


I would recommend browsing through the CERN teaching resources.



How many photons can an electron absorb and why?



This reminds me of "how many angels can dance on the tip of a needle". :)


If you read the links provided you will understand that an elementary particle does not absorb a photon, it can interact with a photon and the result can be variable, but there will always be two particles in and two particles out, because of momentum conservation. The possible results of a photon interacting with an electron are drawn as Feynman diagrams. The same electron in its trajectory can interact with an unlimited number of photons.
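Since a free electron can only scatter a photon rather than absorb it, a minimal numeric sketch of one such process, Compton scattering, may help. The constants are standard CODATA values and the formula is the usual wavelength-shift relation $\Delta\lambda = (h/m_e c)(1-\cos\theta)$:

```python
import math

H = 6.62607015e-34      # Planck constant, J*s
M_E = 9.1093837015e-31  # electron rest mass, kg
C = 2.99792458e8        # speed of light, m/s

def compton_shift(theta):
    """Wavelength shift of a photon scattered off a free electron at angle theta."""
    return H / (M_E * C) * (1.0 - math.cos(theta))

# Back-scattering (theta = pi) gives the maximum shift, twice the
# electron's Compton wavelength of about 2.43 pm:
print(compton_shift(math.pi))
```

Two particles in, two particles out: the photon is redirected and red-shifted, not swallowed.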



Can all fundamental particles that can absorb photons absorb the same amount of photons and why?




Particles interact, and do not absorb. And the interaction with photons will depend on the coupling constants in the Feynman diagrams. If a particle has no charge its probability of interacting with a photon is very low, through higher order diagrams, so no, not all particles interact with the same probability with a photon.



If we increase the velocity of a fundamental particle, do we then increase the amount of photons it can emit?



A charged fundamental particle interacting with an electric or magnetic field can emit photons (bremsstrahlung or synchrotron radiation). The higher the energy of the particle, the higher the probability of a bremsstrahlung photon being emitted, so yes: the higher the energy, the more photons from a charged particle accelerating in a magnetic or electric field. If the velocity is constant there is no emission.



At constant temperature and mass of an isolated fundamental particle that does not and will not move (constant speed and 0 vibration), is emitting a photon the only way to lose energy?



Temperature has no meaning at the particle level. What matters is the kinetic energy of the particle in question. A particle can only lose energy via interactions with other particles/fields. Unless it interacts it does not lose any energy.



Edit for Clarification:


The above addresses the naive question on the number of photons an electron can absorb, and that is zero for free electrons and continuum photons: there is interaction and not absorption.


Electrons which are bound in atoms or molecules (or even crystals) are in quantized energy states. A photon with the appropriate energy can kick the electron up to a higher quantized level, and in doing so the photon is absorbed/disappears. In this case of a potential well, one photon can be absorbed by the system electron-in-potential-well at a time. There could be a second photon of appropriate energy which could kick it up again, but the number of times this can happen is countable, and finally the electron will be free and the atom ionized. Usually the electron cascades down to the lower level, possibly emitting more photons of lower energy as it falls. A specific bound electron can help in the absorption of a photon by the system a limited number of times.


temperature - Can UV light make us invisible?


For an object to emit different EM waves, its temperature needs to increase. So what if we, or some material, could be so hot that it would emit ultraviolet light, and thanks to that be invisible to the human eye?


I have a lot of questions about this and I would like you to answer me.



  1. At which temperature does an object emit UV light?

  2. If an object emits UV light, we wouldn't see anything or we would see some type of violet light?

  3. Is there any material that can get to that temperature without melting?


  4. Is there any powerful insulator?


Thanks for reading this, and for breaking my dream of making someone invisible. I invite you to day-dream and imagine stupid questions.
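A hedged numeric aside on the first question, using Wien's displacement law and assuming the object radiates approximately as a blackbody: the emission *peak* moves into the UV only above roughly 7000 K, and even then the thermal spectrum still contains plenty of visible light, so the object would glow brightly rather than vanish.

```python
# Wien's displacement law: lambda_peak = b / T, so T = b / lambda_peak.
WIEN_B = 2.897771955e-3  # Wien displacement constant, m*K
UV_EDGE = 400e-9         # violet edge of visible light, m

# Minimum temperature for the blackbody peak to sit in the UV:
T_min = WIEN_B / UV_EDGE
print(T_min)  # roughly 7200 K, hotter than any solid material survives
```

No known solid stays intact at that temperature (tungsten, the most refractory metal, melts near 3695 K), which answers question 3 in the negative.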




Tuesday 22 August 2017

particle physics - What's the symbol for the antiparticle of the $\Delta^+$ baryon?


It can't be $\Delta^-$ since that is another particle also made up of quarks (not antiquarks). I can think of four possibilities:



  1. $\overline\Delta^+$

  2. $\overline{\Delta^+}$

  3. $\overline\Delta^-$

  4. $\overline{\Delta^-}$



I am sure someone has asked a similar question, but I failed at searching for it.



Answer



The $\Delta$ is a quartet of particles with isospin 3/2: $$ \Delta^-, \Delta^0, \Delta^+, \Delta^{++} $$ I would expect the anti-$\Delta$ to be written $\bar\Delta$, with the four isospin projections $$ \bar\Delta^{--}, \bar\Delta^-, \bar\Delta^0, \bar\Delta^+ $$ In this case the antiparticle of the $\Delta^+$ would be the $\bar\Delta^-$.


If you'd like a canonical reference, look for a paper about pion production with antiproton beams. Note that there's a whole spectrum of $\Delta$ resonances, not just the lightest one at 1232 MeV.


thermodynamics - If the Big Bang started in a state of low entropy then shouldn't the Big Crunch be in a state of low entropy also?


I recently read that:



What our current level of thermodynamics does tell us though is that if there is such a thing as a Big Crunch, the universe will look a lot different as it heads towards collapse than it did when expanding – the universe started in a state of low entropy and high order and will end in a state of high entropy and low order.



But shouldn't the order be high, and hence the entropy low, in the Big Crunch (assuming a closed universe)?


The reason I write this is because the gravitational forces in a closed universe are so strong they will cause gravitational collapse so the universe will shrink and go backwards, eventually through all the stages of the big bang and become a singularity once more.


It was my understanding that a singularity has a state of high order and low entropy. Have I got this the wrong way round?



If so; Can someone please explain why?



Answer



Exactly at a singularity, the physics is by definition not well-defined and we cannot assign it a definite entropy. However, we can discuss the region very close to a singularity (i.e. just after the Big Bang or just before the Big Crunch). A priori, this region can have either high or low entropy - there's no rule saying that things always have to have low entropy near a singularity. Indeed, we believe that just after the Big Bang, the universe had very low entropy, while just before the Big Crunch it would have very high entropy - just as your source says.


(You may be thinking that the entropy must be low after the Big Bang because everything is "near the same place." But this is wrong because (a) that's only taking into account the positional entropy, not the entropy of momenta, and (b) neither the Big Bang nor the Big Crunch did/will occur at any one particular location in space.)


The reason that the universe would have very high entropy just before the Big Crunch is just plain old thermodynamics - in an ergodic system, high entropy states are overwhelmingly more likely than low-entropy ones just by a basic counting argument. The reason why the universe had very low entropy right after the Big Bang is much less well-understood, although cosmic inflation may have played a role. The only thing we know for sure is that the universe couldn't have been anywhere close to maximally entropic right after the Big Bang, or we wouldn't be here having this discussion! (Well, the Boltzmann brain supporters think that we aren't having this discussion, and I'm just a momentary figment of your imagination. But trust me, I'm real. Would I lie to you?)


density - Does an object need fluid under it to float?


Suppose I have a wooden block that floats in water and I put it in an empty glass beaker. Then I poured water into the beaker so carefully that no water went under the block. In that case, would the block float, or would it remain under the water?


As there is no water under the block there is no upward thrust on the base. Therefore it seems that the block would not float.



On the contrary, there is a tangential upward thrust on the sides of the block. Because of this thrust, the block will float.


Which is the correct answer?


EDIT


I think the right answer is : the block would not float. Thanks to sammygerbil, anna v, Kieran Moynihan.


This reminds me of another question.


A wooden block is floating on water in an airtight container. Is it going to float more or float less in the following cases?



  1. All the air has been pulled out from the container.

  2. We compress the air above the water.



My Thought


Case 1:


When all the air has been pulled out of the container there is no buoyant force due to air, so the total downward force acting on the block would increase and the block would float less.


OR,


As there is no air below the block, there is no buoyant force due to air. When all the air has been pulled out of the container there is no air pressure, so the block would float more.


Case 2:


When we compress the air, the density of air would increase. So the weight of the displaced air as well as the buoyant force due to air would also increase and the block would float more.


OR,


As there is no air below the block, there is no buoyant force due to air. When we compress the air, the block would float less as the air puts more downward pressure on the block.


Now which is correct or incorrect and why?




Answer



IN ANSWER TO THE ORIGINAL QUESTION


Yes, there must be fluid vertically below some part of the object for it to float, but there need not be fluid below all parts of it. The object must have some surface area which has an inward normal with an upward component, so that the water pressure on it has an upward component. For example, the block could float if the sides slope inwards toward the base, as Kieran Moynihan suggests.


In his answer to No buoyancy inside liquid Chester Miller shows that the resultant buoyant force of fluid pressure ("upthrust") on an object which is touching the bottom of a container is $$B=(V-hA)\rho_wg$$ where $V$ is the volume of the submerged part of the object, $h$ is the depth of fluid and $A$ is the area of contact with the bottom of the container.


If the sides of the block are vertical then $hA \ge V$ for all values of $V$. (The equality applies if the object is not fully submerged.) The formula confirms anna v's observation that no amount of water will make the block float, even if it is far less dense than water.


This formula also shows that even if $A$ is very small you can make $B$ negative (ie the "buoyant" force becomes a downward force) by increasing the depth $h$ of fluid. The surprising consequence is that an object which would otherwise float when free of the bottom of the container (ie when there is fluid below all parts of it) can be made to stay on the bottom if a small part of it is already touching the bottom. For example, a large helium balloon can be tethered to the ground using a light suction cup which is much smaller in area than the cross-section of the balloon.
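The formula above is easy to check numerically. A minimal sketch, assuming SI units and a cuboid block with vertical sides (so that $hA \ge V$ whenever the block is submerged):

```python
RHO_W, G = 1000.0, 9.81  # water density kg/m^3, gravitational acceleration m/s^2

def bottom_upthrust(V, h, A):
    """Net fluid force B = (V - h*A)*rho*g on an object sealed to the bottom.

    V: submerged volume (m^3), h: fluid depth (m), A: contact area (m^2).
    """
    return (V - h * A) * RHO_W * G

# Cube of side 0.1 m sealed to the bottom under 0.5 m of water:
A, H = 0.1 * 0.1, 0.1
V = A * H
B = bottom_upthrust(V, 0.5, A)
print(B)  # negative: the water pushes the block DOWN, so it cannot float

# If the water level only just reaches the top of the cube (h = H), B = 0:
print(bottom_upthrust(V, H, A))
```

Note how deepening the water makes $B$ more negative, which is exactly the tethered-balloon effect described above.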




IN ANSWER TO THE QUESTION IN YOUR EDIT


Intuitively you would think that increasing the pressure of the air increases the downward force on the block, making it sink lower in the water, but this is not the case. In both cases what happens to the block depends not on the pressure of the air but on the pressure gradient in the air. If the pressure is uniform throughout the air space, an increase or decrease has no effect on the depth at which the object floats in the water. But if there is a pressure gradient (which necessarily increases downwards) then an increase in the average pressure makes the object rise up in the water, and a decrease makes it sink lower.


Explanation



The explanation is similar to that in Why does a helium filled ballon move forward in a car when the car is accelerating?


The forces on the block are initially balanced. The vertical forces are the weight $W$ of the block and the pressure-forces $F_1$ of the air on the upper face and $F_2$ of the water on the lower face of the block : $$F_2=W+F_1$$ To avoid complications I assume that the block is cuboid so that the areas of upper and lower faces are equal.


Suppose the air pressure is constant throughout the upper part of the container. Then an increase in air pressure increases the forces $F_1, F_2$ equally, so the depth at which the block floats in the water does not change. The increase in pressure at the upper face is transmitted through the air and water to the lower face, increasing it by the same amount.


The air pressure would be approximately constant throughout its volume if the air is only slightly compressible and its density is low compared with that of the water. Both these conditions usually apply at typical atmospheric pressures.


However, if there is a significant pressure gradient in the air then the pressure at the surface of the water will be greater than at the upper face of the block. It is the pressure at the water surface which is transmitted to the lower face of the block, so the increase in force on the lower face would be greater than that on the upper face, and the block would rise up in the water.


Another way of seeing this is to imagine that the air becomes as dense as the water. Then since the block floats in water it will also float upwards into the dense air.


There will be a significant pressure gradient in the air if either it is compressible or its density is comparable with that of the water.


Monday 21 August 2017

classical mechanics - Generalized definitions of Lagrangian and Hamiltonian functions


When we enter into the scope of Analytical mechanics we usually start with these two primary notions: Lagrangian function & Hamiltonian function


And usually textbooks define Lagrangian as $L=T-V$ and Hamiltonian as $H=T+V$ where $T$ is Kinetic energy and $V$ is Potential energy. But as we proceed it turns out Lagrangian and Hamiltonian may not always have these values and this occurs in just some special cases and there is a more general definition.



My question is:




  1. What are the generalized definitions of these two functions?




  2. Is there a general defining equation where the mentioned values could be extracted as a special case?




EDIT: A third question



Do we know about the notions of "Kinetic energy" and "Potential energy" beforehand or are they defined after the Lagrangian gets its famous form $L=T-V$ and then we assign $T$ and $V$ their respective definitions?


EDIT 2: A fourth question


What if there were non-conservative fields? Then obviously the $V$ term couldn't capture their effects (since potentials are always associated with conservative fields). How, then, could we bring the effects of non-conservative fields into play?




electricity - Why do Generators generate inconsistent voltage?


I'm ultimately trying to understand the function of resistors in a circuit. The way I see it, a small increase in voltage could result in a large increase in current that would "burn out" an LED if there is no resistor to ensure that the correct amount of current reaches the LED. So what causes a Generator to generate inconsistent voltage that would call for a resistor in the first place?



Answer




It's not merely “a small increase:” any voltage operating over zero resistance drives an infinite current.


Nature abhors an infinity, so in practice you drive things out of their predictable regimes. Without the resistors there you may start to notice all of the ways that your other components fail to be ideal, like your power supply's output impedance or the resistivity in your wires. These are hard to predict and hard to control over time. Other components like inductors and capacitors can also absorb the infinities, but they have their own complications: in particular, LC circuits are prototypical oscillators and you might have trouble building systems in a quiet, predictable state.


In fact, resistors cannot smooth out irregularities in power supply voltage: they transfer it directly to irregularities in current. If you want to smooth that out in a DC scenario, you want a “low pass filter:” you either want a capacitor in parallel with your load, so that dips in the power supply’s voltage are temporarily supplemented with the capacitor’s voltage while peaks are greedily absorbed by the capacitor first, or an inductor in series with the load, which does the same vis-a-vis the current. If you want to target a specific cutoff frequency for such a filter, resistors also become indispensable for that.


We need resistors as voltage dividers and to take a fixed voltage difference and convert it to a predictable low current independent of the details of the wiring and the power supply. We need them to damp out oscillators that we create with inductors and capacitors. But we don't use them to make our power supplies more uniform, because they cannot store energy, they can only dissipate it.
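As a concrete illustration of that "predictable low current" role, here is a minimal sketch of sizing a series resistor for the LED in the question. The supply voltage, forward drop, and target current are assumed example values, not from the original post:

```python
V_SUPPLY = 5.0    # assumed supply voltage, V
V_FORWARD = 2.0   # assumed LED forward voltage drop, V
I_TARGET = 0.020  # desired LED current, A (20 mA)

# Ohm's law across the resistor: it drops V_SUPPLY - V_FORWARD at I_TARGET.
R = (V_SUPPLY - V_FORWARD) / I_TARGET
print(R)  # 150.0 ohms

# A 0.5 V supply wobble now changes the current by only ~3 mA:
dI = 0.5 / R
print(dI)
```

Without the resistor, that same 0.5 V wobble would land directly on the LED's steep exponential I-V curve and could multiply the current many times over.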


optics - Virtual vs Real image


I'm currently doing magnification and lenses in class, and I really don't get why virtual and real images are called what they are.


A virtual image occurs when the object is closer to the lens than the focal length, and a real image occurs when the object is further away than the focal length.



But why virtual and real? What's the difference? You can't touch an image no matter what it's called, because it's just light.



Answer



You can project a real image onto a screen or wall, and everybody in the room can look at it. A virtual image can only be seen by looking into the optics and can not be projected.


As a concrete example, you can project a view of the other side of the room using a convex lens, and can not do so with a concave lens.




I'll steal some images from Wikipedia to help here:


First consider the ray optics of real images (from http://en.wikipedia.org/wiki/Real_image):


real images formed by a single convex lens or concave mirror


Notice that the lines that converge to form the image point are all drawn solid. This means that there are actual rays, composed of photons originating at the source object. If you put a screen in the focal plane, light reflected from the object will converge on the screen and you'll get a luminous image (as in a cinema or an overhead projector).


Next examine the situation for virtual images (from http://en.wikipedia.org/wiki/Virtual_image):



virtual images formed by a single concave lens or convex mirror


Notice here that the image is formed by one or more dashed lines (possibly with some solid lines). The dashed lines are drawn off the back of the solid lines and represent the apparent path of light rays from the image to the optical surface, but no light from the object ever moves along those paths. The light energy from the object is dispersed, not collected, and can not be projected onto a screen. There is still an "image" there, because those dispersed rays all appear to be coming from the image. Thus, a suitable detector (like your eye) can "see" the image, but it can not be projected onto a screen.
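The ray diagrams above can be mirrored in a quick thin-lens calculation. This is only a sketch, using the common sign convention in which a positive image distance means a real image and a negative one a virtual image:

```python
def image_distance(f, d_o):
    """Thin-lens equation 1/f = 1/d_o + 1/d_i, solved for the image distance d_i."""
    return 1.0 / (1.0 / f - 1.0 / d_o)

# Object outside the focal length of a convex lens (f = 10, d_o = 30):
print(image_distance(10.0, 30.0))  # ~ +15.0: real image, projectable on a screen

# Object inside the focal length (d_o = 5):
print(image_distance(10.0, 5.0))   # ~ -10.0: virtual image, seen only through the lens
```

The sign flip at $d_o = f$ is exactly the real/virtual boundary the question describes.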


physical chemistry - Why is quicksilver (mercury) liquid at room temperature?


This is a nice question once you notice it, and I am really looking for a proper answer.


Take quicksilver (Hg) in the periodic table. It has one proton more than gold (melting point 1337.33 K), and one less than thallium (melting point 577 K). It belongs to the same group as zinc (692.68 K) and cadmium (594.22 K). These are not very high melting points, but still dramatically higher than quicksilver's (234.32 K). When its neighbors melt, quicksilver vaporizes (at 629.88 K).


What is the reason for this exceptional behavior of quicksilver?



Answer



Although the question has been partially answered, there is a superb reference on this topic which will certainly give you some of the deep, and not so deep insights needed to understand the answer to this question.


http://pubs.acs.org/doi/abs/10.1021/ed068p110



Nevertheless, both the contraction of the s(1/2) orbitals predicted by the Dirac equation and the filled valence shell of Hg are the major causes of the odd physical properties of Hg. However, there are other effects that should be considered, and the paper above is pretty clear on those.


Let me reformulate in order to make things clearer and more specific:


The reason for liquid Hg can be stated simply: the outer electrons of Hg (6s2) that participate in metallic bonds are "less available" for bonding (which might be observed for example by looking at binding energies of clusters of Hg, dimers, etc.) than in other common metals, and hence the interaction between Hg atoms is much weaker compared to other metal-metal bonds. The explanation for the "lesser availability" of the 6s electrons is the contraction of the 6s orbitals, caused by the high speeds achieved by those electrons. This effect is promptly predicted by the Dirac equation.


Now, one would ask: what about Au? It has a 6s electron and it is a very stable solid metal, which means it forms strong metallic bonds with other gold atoms. The question then turns out to be: what makes metallic bonds strong in gold and weak in Hg? And here comes the indispensable valence-shell argument: gold has only one electron in its 6s orbital, while Hg has two electrons in the 6s orbital. It turns out that this creates a large difference in bonding, and we can use a simple qualitative molecular orbital argument to understand why. Let's imagine a simple picture of the bonding between two Au atoms, such that only the 6s electron of each atom contributes to the bonding. That being true, 2 MOs are formed in the process of bonding, and the 6s electron from each atom occupies the bonding orbital, while the antibonding MO is unoccupied. The bonding MO has a lower energy than the 6s orbitals, which stabilizes the dimer, while the antibonding has a higher energy than the 6s orbitals, making it unfavorable for a bond to form. For Au, as we just saw, the 6s electrons would be in the bonding orbital and the metallic bond would be favorable. Now, this scheme is easily extendable to Hg, and the surprise here is that since the 6s of Hg has 2 electrons, in order for Hg to dimerize, 2 electrons from our simplified model would go to the bonding molecular orbital, while the other two would be in the antibonding molecular orbital. Therefore, the stabilizing effect generated by electrons in the bonding orbitals is now offset by the destabilization provoked by the two electrons in antibonding molecular orbitals. In these circumstances, the metallic bond would be unfavorable and the interaction between the Hg atoms will be weak, i.e., dispersion will probably dominate.
By scaling these arguments to a much larger number of Hg atoms we can see that at room temperature solid Hg would not be stable, just like a bunch of other systems in which the constituents interact mainly via van der Waals forces.
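The electron-counting at the heart of this MO argument can be written out as simple arithmetic. This is only an illustration of the bond-order bookkeeping for the hypothetical Au2 and Hg2 dimers, not a quantitative model:

```python
def bond_order(n_6s_per_atom):
    """Bond order of a homonuclear dimer where only the 6s electrons bond.

    bond order = (bonding electrons - antibonding electrons) / 2
    """
    electrons = 2 * n_6s_per_atom      # two atoms contribute to the dimer
    bonding = min(electrons, 2)        # the sigma bonding MO holds at most 2
    antibonding = electrons - bonding  # the overflow fills the sigma* MO
    return (bonding - antibonding) / 2

print(bond_order(1))  # Au2: 1.0 -> a net bond, strong metal-metal bonding
print(bond_order(2))  # Hg2: 0.0 -> no net bond, only weak dispersion remains
```

Zero net bond order for the Hg pair is the qualitative reason the metallic bonding collapses and mercury stays liquid at room temperature.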


A question that might arise from the analysis above is why cadmium and zinc, which have filled 5s and 4s orbitals respectively, are solids at room temperature. The reason is that in this case the s orbitals do not suffer appreciable relativistic contraction, and therefore it is not unfavorable for these atoms to generate solids in which those electrons participate in metallic bonds. We can note that Hg(2+)-Hg(2+) is a quite stable ion in which the Hg(2+) also have filled shells (though the 6s orbitals are unoccupied now). The reason for this last case is that the most external valence subshell is the 5d, with 10 electrons. This subshell is much more diffuse than the 6s shell and the relativistic effects are opposite, i.e., the d orbitals are extended compared to the non-relativistic case, and hence the electrons are more easily engaged in interionic bonding, even if they are held by dispersion forces only, since the polarizability of Hg(2+) is much larger compared to Hg.


The arguments above were derived from a simplified model, but more complex ones based on them might easily be derived. I know that there are weak spots, but as noted in my comments, experimental facts and calculations converge with those explanations, and the paper referenced above shows part of this in more depth, and even explains other interesting relativistic effects such as the "inert pair effect" that general chemistry professors love to talk about but, most of the time, do not know what it really is.


Understanding Stagnation point in pitot fluid

What is a stagnation point in fluid mechanics? At the open end of the pitot tube the velocity of the fluid becomes zero. But that should result...