Saturday, 31 October 2020

quantum mechanics - Boundary Conditions in a Step Potential


I'm trying to solve problem 2.35 in Griffiths's Introduction to Quantum Mechanics (2nd edition), but it left me rather confused, so I hope you can help me understand this a little better.


The aim of the problem is to find the probability that a particle with kinetic energy $E>0$ will reflect when it approaches a potential drop $V_0$ (a step potential).


I started by writing down the Schrödinger equation before and after the potential drop, where $V(x)=0$ for $x<0$ and $V(x)=-V_0$ for $x>0$:


$\psi''+k^2\psi=0, x<0$


$\psi''+\mu^2\psi=0, x>0$


where $k=\sqrt{2mE}/\hbar$ and $\mu=\sqrt{2m(E+V_0)}/\hbar$



This would give me the general solutions


$\psi(x)=Ae^{ikx}+Be^{-ikx}, x<0$


$\psi(x)=Fe^{i\mu x}+Ge^{-i\mu x},x>0$


Now, I reason that in order to have a physically admissible solution we need $B=0$, since the second term blows up when $x$ goes to $-\infty$, and $F=0$, since the first term in the second row blows up when $x$ goes to $\infty$. This would leave us with the solutions


$\psi(x)=Ae^{ikx}, x<0$


$\psi(x)=Ge^{-i\mu x},x>0$


which I could then solve using the boundary conditions. However, I realise that this is wrong, since I need $B$ to calculate the reflection probability. In the solutions to this book they use the following general solutions (they don't say how they got them, though).


$\psi(x)=Ae^{ikx}+Be^{-ikx}, x<0$


$\psi(x)=Fe^{i\mu x},x>0$


This is not very well explained in the book, so I would really appreciate it if someone could explain how to decide which parts of the general solutions I should remove in order to get the correct general solution for a specific problem.




Answer



$e^{-i k x}$ does not blow up as $x \rightarrow -\infty.$ You're thinking in terms of real exponentials, but this is a complex exponential. That is, as long as $k$ is real we have:



  1. $\lim\limits_{x \rightarrow -\infty} e^{- k x} = \infty$

  2. $\lim\limits_{x \rightarrow -\infty} e^{- i k x}$ does not exist (since $e^{- i k x} = \cos{kx} - i \sin{kx}$).


So that explains why the $B$ term is still there in the solution.


The reason that Griffiths discounts the $G$ term is that it represents a wave traveling in from the positive $x$ direction.


Think about it this way:


The problem at hand is of a particle coming from the $-x$ direction and encountering a sudden potential drop. The particle arriving gives rise to the term $A e^{i k x}$, as this is a traveling wave moving in the $+x$ direction.



When the particle encounters the barrier, it can either reflect (giving rise to the term $B e^{- i k x}$) or transmit (giving rise to the term $F e^{i \mu x}$).


Note that there is no circumstance under which a particle could be coming from the $+x$ direction in the region to the right of the potential drop, which is what the term $G e^{- i \mu x}$ would imply. A particle comes from the left, and then either moves forward to the right or reflects back to the left. A situation under which the particle travels towards the potential drop from the $+x$ direction is not physically admissible under this circumstance.


So we drop the term $G e^{- i \mu x}$, and we're left with


$\psi(x) = \begin{cases} A e^{i k x} + B e^{- i k x}, & x < 0 \\ F e^{i \mu x}, & x > 0 \end{cases}$


as described in the book.
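For completeness (this is not part of the quoted answer, but it is the standard next step, and it recovers the quantity the question was after): requiring that $\psi$ and $\psi'$ be continuous at $x=0$ with these solutions gives

$$A + B = F, \qquad ik(A - B) = i\mu F \quad\Longrightarrow\quad \frac{B}{A} = \frac{k-\mu}{k+\mu},$$

so the reflection probability is

$$R = \left|\frac{B}{A}\right|^2 = \left(\frac{k-\mu}{k+\mu}\right)^2 .$$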


quantum field theory - Why is baryon or lepton number violation in the Standard Model a non-perturbative effect?


Baryon number (B) or lepton number (L) violation in the Standard Model arises from the triangle anomaly, right? Triangle diagrams are perturbative diagrams. Then why is B or L violation in the Standard Model said to be a non-perturbative effect? I'm confused.



Answer



It is a non-perturbative effect because it is 1-loop exact.


The triangle diagram is actually the least insightful method to think about this, in my opinion. The core of the matter is the anomaly of the chiral symmetry, which you can also, for example, calculate by the Fujikawa method examining the change of the path integral measure under the chiral transformation. You can obtain quite directly that the anomaly is proportional to


$$\int \mathrm{Tr} (F \wedge F)$$


which is manifestly a global, topological term; (modulo some intricacies) it is the so-called second Chern class and takes only the values $8\pi^2 k$ for integer $k$. It is, by the Atiyah-Singer index theorem (this can also be seen by Fujikawa), essentially the difference between the numbers of positive- and negative-chirality zero modes of the Dirac operator. This is obviously a discontinuous function of $A$ (or $F$), which is already bad for something which, if it were perturbative, should be a smooth correction to something, and it is also the number describing which instanton vacuum sector we are in, see my answer here. Since perturbation theory takes place around a fixed vacuum, this is not a perturbative effect: it is effectively describing a tunneling between two different vacuum sectors.
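A supplementary remark (standard instanton counting, not part of the answer above): the tunneling amplitude between neighbouring vacuum sectors carries the weight of the Euclidean instanton action,

$$\mathcal{A} \sim e^{-S_{\text{inst}}} = e^{-8\pi^2/g^2},$$

and every derivative of $e^{-8\pi^2/g^2}$ with respect to $g$ vanishes at $g=0$, so its Taylor series around $g=0$ is identically zero. No finite order of perturbation theory can produce such a contribution, which is another way of seeing why the effect is called non-perturbative.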


newtonian mechanics - Throwing a ball upwards in an accelerating train


If I throw a ball upwards to a certain height in an accelerating train, will it end up back in my hand? At the moment I release the ball, it will have a velocity equal to that of the train at that instant. But because the train is accelerating, the train will have a greater velocity by the time the ball comes down, so the ball will travel a smaller horizontal distance and should fall behind me. Is my reasoning correct?



Answer



User Sahil Chadha has already answered the question, but here's the math and a pretty picture for anyone who is unconvinced that you're right.


Since the train is accelerating, from the perspective of an observer on the train, the ball will experience a (fictitious) force in the direction opposite the train's travel having magnitude $ma$ where $m$ is the mass of the ball and $a$ is the magnitude of the acceleration of the train. If we call the direction of travel the positive $x$-direction, and if we call the "up" direction the positive $y$-direction, then the equations of motion in the $x$- and $y$-directions will therefore be as follows: \begin{align} \ddot x &= -a \\ \ddot y &= -g. \end{align} The general solution is \begin{align} x(t) &= x_0 + v_{x,0} t - \frac{1}{2}a t^2 \\ y(t) &= y_0 + v_{y,0} t - \frac{1}{2}g t^2 \end{align} Now, let's say that the origin of our coordinate system lies at the point from which the ball is thrown so that $x_0 = y_0 = 0$ and that the ball is thrown up at time $t=0$ with velocity $v_{y,0} = v$ and $v_{x,0} = 0$ in the positive $y$-direction, then the solutions become \begin{align} x(t) &= -\frac{1}{2}a t^2\\ y(t) &= vt - \frac{1}{2} gt^2 \end{align} So what does this trajectory look like? By solving the first equation for $t$, and plugging this back into the equation for $y$, we obtain the following expression for the $y$ coordinate of the particle as a function of its $x$ coordinate along the trajectory: \begin{align} y(x) = v\sqrt{-\frac{2x}{a}} +\frac{g}{a} x \end{align} Here's a Mathematica plot of what this trajectory looks like for $v = 1.0\,\mathrm m/\mathrm s$ and the list $a = 9.8,5.0,2.5,1,0.1\,\mathrm m/\mathrm s^2$ of values for the train's acceleration:


[Plot: the trajectories $y(x)$ for the listed values of the train's acceleration $a$]


From the point of view of someone on the train, the ball flies backward along a sort of deformed parabola, but the smaller the acceleration is, the more it simply looks the way it would if you were to throw a ball vertically in an unaccelerated train.
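For anyone who wants to reproduce a plot like the one described above without Mathematica, here is a minimal Python/matplotlib sketch (my own, using the same $v$ and the same list of accelerations; the trajectory is parameterised by time rather than by $x$):

```python
import numpy as np
import matplotlib.pyplot as plt

g = 9.8                                  # m/s^2
v = 1.0                                  # m/s, initial upward speed used in the answer
t = np.linspace(0.0, 2.0 * v / g, 200)   # flight time until the ball returns to y = 0

# In the train frame: x(t) = -a t^2 / 2 (drifts backwards), y(t) = v t - g t^2 / 2
for a in [9.8, 5.0, 2.5, 1.0, 0.1]:
    x = -0.5 * a * t**2
    y = v * t - 0.5 * g * t**2
    plt.plot(x, y, label=f"a = {a} m/s^2")

plt.xlabel("x in the train frame (m)")
plt.ylabel("y (m)")
plt.legend()
plt.show()
```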


Friday, 30 October 2020

supersymmetry - Basic Grassmann/Berezin Integral Question


Is there a reason why $\int\! d\theta~\theta = 1$ for a Grassmann integral? Books give arguments for $\int\! d\theta = 0$ which I can follow, but not for the former one.




electromagnetism - Why do waves diffract?


There have already been a lot of questions on this site about diffraction, but I still believe this one might be slightly different. For electromagnetic waves, diffraction and any other phenomenon of wave propagation can be handled by Huygens' principle, a geometric construction that asks us to consider all points on a wavefront as secondary sources of wavelets.
A justification of this treatment is provided by Feynman:
[Images: the relevant excerpt from Feynman's argument]




I understood this completely. But diffraction is a very general phenomenon. If we talk about mechanical waves (sound waves), this treatment of superposing the fields produced by the opaque screen and a hypothetical plug is no longer valid, yet the diffraction details are similar. Why then, on an intuitive level, is there diffraction whose details can be worked out by considering secondary sources?




Answer



I'd like to build on dmckee's answer by answering the OP's follow-up question:



Can you please explain why considering secondary sources as in Huygens' principle is justified, i.e., why do we get correct results by assuming secondary sources when there are actually no sources other than the original source? As Feynman explains in the case of electromagnetic waves, it is because the diffracted wave is equivalent to the superposition of the electric fields of a hypothetical plug containing several independent sources.



Huygens' principle is actually a fairly fundamental property of solutions of the Helmholtz equation $(\nabla^2 + k^2)\psi = 0$ or the D'Alembert wave equation $(c^2\nabla^2 - \partial_t^2)\psi = 0$. For these equations the Green's function is a spherical wave diverging from the source. All "physically reasonable" solutions (given reasonable physical assumptions such as the Sommerfeld Radiation Condition) in free-space regions away from the sources can be built up by linear superposition from a system of these sources outside the region under consideration. Already this is sounding like Huygens' principle, but one can go further and, with this prototypical solution and the linear superposition principle together with Gauss's divergence theorem, show that waves can be approximately thought of as arising from a distributed set of these "building block" spherical sources spread over the wavefront: this result leads to the Kirchhoff Diffraction Integral, thence to various statements of Huygens's Principle.


This treatment is worked through in detail in §8.3 and §8.4 of Born and Wolf, "Principles of Optics", or in Hecht, "Optics", which I don't have before me at the moment.
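If you want to see Huygens' construction at work numerically, here is a small sketch (my own, not from the answer): it treats a single slit as a row of secondary point sources, sums the spherical wavelets $e^{ikr}/r$ on a distant screen, and compares the result with the textbook Fraunhofer $\mathrm{sinc}^2$ pattern. The wavelength, slit width and distances are arbitrary illustrative choices.

```python
import numpy as np

wavelength = 500e-9            # 500 nm light (illustrative)
k = 2 * np.pi / wavelength
slit_width = 50e-6             # 50 micron slit (illustrative)
screen_dist = 1.0              # 1 m from slit to screen

# Huygens construction: many secondary point sources spread across the slit
source_y = np.linspace(-slit_width / 2, slit_width / 2, 2000)
screen_y = np.linspace(-0.05, 0.05, 1001)

# Sum the spherical wavelets e^{ikr}/r from every secondary source
r = np.sqrt(screen_dist**2 + (screen_y[:, None] - source_y[None, :])**2)
intensity = np.abs(np.sum(np.exp(1j * k * r) / r, axis=1))**2
intensity /= intensity.max()

# Textbook single-slit Fraunhofer result for comparison: sinc^2(pi a y / (lambda L))
beta = np.pi * slit_width * screen_y / (wavelength * screen_dist)
fraunhofer = np.sinc(beta / np.pi)**2    # np.sinc(x) = sin(pi x)/(pi x)

print(np.max(np.abs(intensity - fraunhofer)))   # small: the two patterns agree closely
```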



electromagnetism - Can a wire driven by a $610$-$670$ THz (the frequency of blue light) AC supply generate blue light?


We know that when we drive an alternating current through a wire, it generates an electromagnetic wave that propagates outward.



But if we have a supply that can generate alternating current at 610 to 670 terahertz, would the wire then generate blue light?




thermodynamics - Why is there no absolute maximum temperature?



If temperature makes particles vibrate faster, and movement is limited by the speed of light, then I would assume temperature must be limited as well. Why is there no limit?



Answer



I think the problem here is that you're being vague about the limits Special Relativity imposes. Let's clarify this by being a bit more precise.


The velocity of any particle is of course limited by the speed of light $c$. However, the theory of Special Relativity does not imply any limit on energy. In fact, as the energy of a massive particle tends towards infinity, its velocity tends toward the speed of light. Specifically,


$$E = \text{rest mass energy} + \text{kinetic energy} = \gamma mc^2$$


where $\gamma = 1/\sqrt{1-(u/c)^2}$. Clearly, for any energy and thus any gamma, $u$ is still bounded from above by $c$.
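A quick numerical illustration of this point (my own sketch; the proton rest energy is just a convenient reference value):

```python
import math

m_c2 = 938.272  # MeV, proton rest energy

for gamma in [1.5, 10.0, 100.0, 1e4, 1e6]:
    E = gamma * m_c2                         # total energy grows without bound...
    beta = math.sqrt(1.0 - 1.0 / gamma**2)   # ...while u/c creeps toward, but never reaches, 1
    print(f"E = {E:.3e} MeV,  u/c = {beta:.15f}")
```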


We know that microscopic (internal) energy relates to macroscopic temperature by a constant factor (on the order of the Boltzmann constant), hence the temperature of particles, like their energy, has no real limit.


quantum mechanics - Definition of symmetrically ordered operator for multi-mode case?


As I understand it, the Wigner function is useful for evaluating the expectation value of an operator. But first you have to write the operator in a symmetrically ordered form. For example:


$$a^\dagger a = \frac{a^\dagger a + a a^\dagger -1}{2}$$


For the single-mode case, where there is only one pair of creation and annihilation operators, the symmetrically ordered operator is well defined. But how is it defined for the multi-mode case? For example, how would we write $$a_1^\dagger a_1 a_2^\dagger a_2$$ in a symmetrically ordered form (such that we could easily evaluate its expectation value using the Wigner function)?




Answer



OK, I assume you are comfortable with the rules of Weyl symmetrization, where you may parlay the $[x,p]=i\hbar$ commutation relation into the $[a,a^\dagger]=1$ one... the combinatorics is identical provided you keep track of the $i$s and the $\hbar$s, etc...


So your surmise is sound. Since different modes commute with each other, they don't know about each other, and you Weyl-symmetrize each mode factor separately.


So, e.g., for your example, $$ a_1^\dagger a_1 a_2^\dagger a_2= a_2^\dagger a_2 a_1^\dagger a_1 = \frac{a_1^\dagger a_1 + a_1 a_1^\dagger -1}{2} ~ \frac{a_2^\dagger a_2 + a_2 a_2^\dagger -1}{2}, $$ etc. All you need to recall is that each mode is in a separate (Fock) space of a tensor product, so all operations factor out into a tensor product of Weyl-symmetrized factors.


This is the reason all multimode/higher-dim phase space generalizations of all these distributions functions are essentially trivial, and most subtleties are routinely illustrated through just one mode.




Edit in response to comment on entangled states. One of the co-inventors of the industry, Groenewold, in his monumental 1946 paper, Section 5.06 on p 459, details exactly how to handle entangled states--in his case for the EPR state. The entanglement and symmetrization is transparent at the level of phase-space parameters (Weyl symbols): the quantum operators in the Wigner map are still oblivious of different modes. What connects/entangles them, indirectly, are the symmetrized δ-function kernels involved, even though this is a can of worms that even stressed Bell's thinking. The clearest "modern" paper on the subject is Johansen 1997, which, through its factorized Wigner function and changed +/- coordinates, reassures you you never have to bother with the quantum operators: the entangling is all in the Wigner function and phase-space, instead! (Illustration: 351884/66086.)


newtonian mechanics - What is the difference between impulse and momentum?


What is the difference between impulse and momentum?


The question says it all... I know the second of them is mass * velocity, but what is the first one for, and when is it used? Also, what are its units, and is there a symbol for it?



Answer



Given a system of particles, the impulse exerted on the system during a time interval $[t_1, t_2]$ is defined as $$ \mathbf J(t_1, t_2) = \int_{t_1}^{t_2} dt\,\mathbf F(t) $$ where $\mathbf F$ is the net external force on the system. Since Newton's second law shows that the net external force on a system is given by $$ \mathbf F(t) = \dot{\mathbf P}(t) $$ where $\mathbf P$ is the total momentum of the system, one has $$ \mathbf J(t_1, t_2) = \mathbf P(t_2) - \mathbf P(t_1) $$ In other words, the impulse is equal to the change in momentum of the system. The dimensions of these quantities are the same, namely mass times velocity.
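Here is a small numerical check of that statement (my own sketch with an arbitrary force profile): integrate the force over time to get the impulse, and compare it with $m\,\Delta v$ obtained from Newton's second law.

```python
import numpy as np

m = 2.0                              # kg (arbitrary)
F0, w = 5.0, 2.0                     # amplitude (N) and angular frequency (rad/s) of the force
t = np.linspace(0.0, 3.0, 300001)    # s
F = F0 * np.sin(w * t)               # an arbitrary time-varying force

# Impulse: numerical (trapezoidal) integral of F over the interval
dt = t[1] - t[0]
J = np.sum(0.5 * (F[1:] + F[:-1])) * dt

# Momentum change from Newton's second law, integrated analytically for this force:
# v(t) = (F0 / (m*w)) * (1 - cos(w*t)), so delta_p = m * (v(T) - v(0))
delta_p = (F0 / w) * (1.0 - np.cos(w * t[-1]))

print(J, delta_p)   # the two agree up to small numerical integration error
```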


You can think of impulse as kind of the "net effect" that a force has in changing the state of motion of a system. Here is an example to illustrate what I mean.


Imagine you're pushing a shopping cart. Let's say you push the cart with a constant force for a short period of time versus a long period of time. When you push it for a short period of time, then the integral of the force with respect to time will be smaller than when you push it for a long period of time, and the result will be that the cart's momentum will not change as much. However, if you were to push the cart for a short period of time, but if you were to push it very hard, then you could make up for the short period of time for which the force acts and still get the cart going fast.


The Upshot: The impulse takes into consideration both the effect of the force on the system, and the duration of time for which the force acts.


quantum field theory - What does it mean for a QFT to not be well-defined?


It is usually said that QED, for instance, is not a well-defined QFT. It has to be embedded or completed in order to make it consistent. Most of these arguments amount to using the renormalization group to extrapolate the coupling up to a huge scale, where it formally becomes infinite.


First of all, it seems bizarre to me to use the renormalization group to increase resolution (go to higher energies). The smearing in the RG procedure is irreversible, and the RG doesn't know about the non-universal parts of the theory which should be important at higher energies.


I have read in a note by Weinberg that Källén was able to show that QED was sick using the spectral representation of the propagator. Essentially, he evaluated a small part of the spectral density and showed that it violated the inequalities for the field-strength renormalization $Z$ that are imposed by unitarity + Lorentz invariance, but I do not have the reference.


So what I would like to know is :


What does it take to actually establish that a QFT is not well-defined?


In other words, if I write down some arbitrary Lagrangian, what test should I perform to determine whether the theory actually exists? Are asymptotically free theories the only well-defined ones?



My initial thought was: We would like to obtain QED (or some other ill-defined theory) using the RG on a deformation of some UV CFT. Perhaps there is no CFT with the correct operator content so that we obtain QED from the RG flow. However, I don't really know how to make that precise.


Any help is appreciated.



Answer



Usually if someone says a particular QFT is not well-defined, they mean that effective QFTs written in terms of these physical variables don't have a continuum limit. This doesn't mean that these field variables aren't useful for doing computations, but rather that the computations done using those physical variables can only be entirely meaningful as an effective description/approximation to a different computation done with a different set of variables. (Think of doing an expansion in a basis and only keeping terms that you know make large contributions to the thing you're trying to compute.)


A funny and circular way of saying this is that a QFT is well-defined if it can be an effective description of itself at all length scales.


You're right that it's slightly perverse to talk about renormalization flow in terms of increasing resolution. But there's a(n inversely) related story in terms of decreasing resolution. Suppose you have a set of fields and a Lagrangian (on a lattice for concreteness, but you can cook up similar descriptions for other regularizations). You can try to write down the correlation functions and study their long distance limits. Your short distance QFT is not well defined if it only flows out to a free theory.


When people talk about having coefficients blow up, they're imagining that they've chosen a renormalization trajectory which involves the same basic set of fields at all distance scales. Flowing down to a non-interacting theory means that -- along this trajectory -- the interactions have to get stronger at shorter distance scales. Testing this in practice can be difficult, especially if you insist on a high degree of rigor in your arguments.


Not all well-defined theories are asymptotically free. Conformal field theories are certainly well-defined, but they can be interacting, and those are not asymptotically free.


Thursday, 29 October 2020

general relativity - Positive Mass Theorem and Geodesic Deviation


This is a thought I had a while ago, and I was wondering if it was satisfactory as a physicist's proof of the positive mass theorem.


The positive mass theorem was proven by Schoen and Yau using complicated methods that don't work in 8 dimensions or more, and by Witten using other complicated methods that don't work for non-spin manifolds. Recently Choquet-Bruhat published a proof for all dimensions, which I did not read in detail.


To see that you can't get zero mass or negative mass, view the space-time in the ADM rest frame, and consider viewing the spacetime from a slowly accelerated frame going to the right. This introduces a Rindler horizon somewhere far to the left. As you continue accelerating, the whole thing falls into your horizon. If you like, you can imagine that the horizon is an enormous black hole far, far away from everything else.


The horizon starts out flat and far away before the thing falls in, and ends up flat and far away after. If the total mass is negative, it is easy to see that the total geodesic flow on the outer boundary brings area in, meaning that the horizon scrunched up a little bit. This is even easier to see if you have a black hole far away, it just gets smaller because it absorbed the negative mass. But this contradicts the area theorem.


There is an argument for the positive mass theorem in a recent paper by Penrose which is similar.


Questions:




  1. Does this argument prove positive mass?

  2. Does this mean that the positive mass theorem holds assuming only the weak energy condition?




quantum gravity - Do all the forces become one?




  1. Were the forces of nature combined in one unifying force at the time of the Big Bang?




  2. By which symmetry is this unification governed?





  3. Is there any evidence for such a unification of forces?




  4. Has any theory or experiment ever been published on this issue? (Even original research or unpublished theories. Anything that one could start with.)






acoustics - Why is the decibel scale logarithmic?


Could someone explain in simple terms (let's say, limited to a high school calculus vocabulary) why decibels are measured on a logarithmic scale?


(This isn't homework, just good old-fashioned curiosity.)



Answer



I don't know anything about the history of the Bel and related measures.



Logarithmic scales--whether for audio intensities, Earthquake energies, astronomical brightnesses, etc--have two advantages:



  • You can look at phenomena over a wide range of scales with numbers that remain conveniently human-sized all the time. An earthquake you can barely detect and one that causes a regional disaster both fit between 1 and 10. Likewise, the stillness of an audio-dead room and the pain of an amp turned up to 11 fit between 10 and 130.

  • Fractional changes are converted into differences, which most people find easier to compute quickly. A three-decibel reduction is always the same fractional change; the EEs get a lot of mileage out of this.


These scales may seem very artificial at first, but if you use them they will become second nature.
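As a concrete illustration of both bullet points above (my own sketch, using the standard convention that a decibel value is ten times the base-10 logarithm of a power ratio):

```python
import math

def db(power_ratio):
    # decibels: ten times the base-10 log of a power ratio
    return 10.0 * math.log10(power_ratio)

print(db(2))       # ~3.01 dB: doubling the power is always "+3 dB", anywhere on the scale
print(db(0.5))     # ~-3.01 dB: halving it is always "-3 dB"
print(db(1e12))    # 120 dB: a factor of a trillion stays a conveniently human-sized number
```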


quantum mechanics - How does non-commutativity lead to uncertainty?


I read that the non-commutativity of the quantum operators leads to the uncertainty principle.


What I don't understand is how the two things hang together. Is it that when you measure one thing first and then the other, you get a predictably different result than when measuring the other way round?


I know what non-commutativity means (even the minus operator is non-commutative) and I think I understand the uncertainty principle (when you measure one thing the measurement of the other thing is kind of blurred - and vice versa) - but I don't get the connection.


Perhaps you could give a very easy everyday example with non-commuting operators (like subtraction or division) and how this induces uncertainty and/or give an example with commuting operators (addition or multiplication) and show that there would be no uncertainty involved.



Answer



There is a fair amount of background mathematics to this question, so it will be a while before the punch line.


In quantum mechanics, we aren't working with numbers to represent the state of a system. Instead we use vectors. For the purpose of a simple introduction, you can think of a vector as a list of several numbers. Therefore, a number itself is a vector if we let the list length be one. If the list length is two, then $(.6, .8)$ is an example vector.



The operators aren't things like plus, minus, multiply, divide. Instead, they are functions; they take in one vector and put out another vector. Multiplication isn't an operator, but multiplication by two is. An operator acts on a vector. For example, if the operator "multiply by two" acts on the vector $(.6, .8)$, we get $(1.2, 1.6)$.


Commutativity is a property of two operators considered together. We cannot say "operator $A$ is non-commutative", because we're not comparing it to anything. Instead, we can say "operator $A$ and operator $B$ do not commute". This means that the order you apply them matters.


For example, let operator $A$ be "switch the two numbers in the list" and operator $B$ be "subtract the first one from the second". To see whether these operators commute, we take the general vector $(a,b)$ and apply the operators in different orders.


As an example of notation, if we apply operator $A$ to $(a,b)$, we get $(b,a)$. This can be written $A(a,b) = (b,a)$.


$$BA(a,b) = (b,a-b)$$


$$AB(a,b) = (b-a,a)$$


When we apply the operators in the different orders, we get a different result. Hence, they do not commute. The commutator of the operators is defined by


$$\textrm{commutator}(A,B) = [A,B] = AB - BA$$


This is a new operator. Its output for a given input vector is defined by taking the input vector, acting on it with $B$, then acting on the result with $A$, then going back to the original vector and doing the same in opposite order, then subtracting the second result from the first. If we apply this composite operator (to wit: the commutator) to $(a,b)$, we get (by subtraction using the two earlier results)


$$(AB - BA)(a,b) = (-a,b)$$



So the commutator of $A$ and $B$ is the operator that multiplies the first entry by minus one.
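If you want to play with these two operators concretely, they are just $2\times 2$ matrices, and a few lines of NumPy (my own sketch) reproduce the results above:

```python
import numpy as np

A = np.array([[0, 1],
              [1, 0]])     # "switch the two numbers"
B = np.array([[1, 0],
              [-1, 1]])    # "subtract the first from the second"

v = np.array([3.0, 5.0])   # an arbitrary vector (a, b)

print(B @ (A @ v))           # [ 5. -2.]  i.e. (b, a-b)
print(A @ (B @ v))           # [ 2.  3.]  i.e. (b-a, a)
print((A @ B - B @ A) @ v)   # [-3.  5.]  i.e. (-a, b): the first entry times minus one
```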


An eigenvector of an operator is a vector that is unchanged when acted on by that operator, except that the vector may be multiplied by a constant. Everything is an eigenvector of the operator "multiply by two". The eigenvectors of the switch operator $A$ are $\alpha(1,1)$ and $\beta(1,-1)$, with $\alpha$ and $\beta$ any numbers. For $(1,1)$, switching the entries does nothing, so the vector is unchanged. For $(1,-1)$, switching the entries multiplies by negative one. On the other hand if we switch the entries in $(.6,.8)$ to get $(.8,.6)$, the new vector and the old one are not multiples of each other, so this is not an eigenvector. The number that the eigenvector is multiplied by when acted on by the operator is called its eigenvalue. The eigenvalue of $(1,-1)$ is $-1$, at least when we're talking about the switching operator.


In quantum mechanics, there is uncertainty for a state that is not an eigenvector, and certainty for a state that is an eigenvector. The eigenvalue is the result of the physical measurement of the operator. For example, if the energy operator acts on a state (vector) with no uncertainty in the energy, we must find that that state is an eigenvector, and that its eigenvalue is the energy of the state. On the other hand, if we make an energy measurement when the system is not in an eigenvector state, we could get different possible results, and it is impossible to predict which one it will be. We will get an eigenvalue, but it's the eigenvalue of some other state, since our state isn't an eigenvector and doesn't even have an eigenvalue. Which eigenvalue we get is up to chance, although the probabilities can be calculated.


The uncertainty principle states roughly that non-commuting operators cannot both have zero uncertainty at the same time because there cannot be a vector that is an eigenvector of both operators. (Actually, we will see in a moment that this is not precisely correct, but it gets the gist of it. Really, operators whose commutators have a zero-dimensional null space cannot have a simultaneous eigenvector.)


The only eigenvector of the subtraction operator $B$ is $\gamma(0,1)$. Meanwhile, the only eigenvectors of the switch operator $A$ are $\alpha(1,1)$ and $\beta(1,-1)$. There are no vectors that are eigenvectors of both $A$ and $B$ at the same time (except the trivial $(0,0)$), so if $A$ and $B$ represented physical observables, we could not be certain of both $A$ and $B$ at the same time. ($A$ and $B$ are not actually physical observables in QM, I just chose them as simple examples.)
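Again, this is easy to check directly (my own sketch): a nonzero vector $v$ is an eigenvector of a $2\times2$ matrix $M$ exactly when $Mv$ is parallel to $v$.

```python
import numpy as np

A = np.array([[0, 1], [1, 0]])    # switch operator
B = np.array([[1, 0], [-1, 1]])   # subtract operator

def is_eigenvector(M, v):
    w = M @ v
    # w is parallel to v iff the 2D "cross product" w[0]*v[1] - w[1]*v[0] vanishes
    return np.isclose(w[0] * v[1] - w[1] * v[0], 0.0)

for v in [np.array([1.0, 1.0]), np.array([1.0, -1.0]), np.array([0.0, 1.0])]:
    print(v, is_eigenvector(A, v), is_eigenvector(B, v))

# (1, 1) and (1, -1) are eigenvectors of A but not of B;
# (0, 1) is an eigenvector of B but not of A -- so there is no common eigenvector.
```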


We would like to see that this works in general - any time two operators do not commute (with certain restrictions), they do not have any simultaneous eigenvectors. We can prove it by contradiction.


Suppose $(a,b)$ is an eigenvector of $A$ and $B$. Then $A(a,b) = \lambda_a(a,b)$, with $\lambda_a$ the eigenvalue. A similar equation holds for $B$.


$$AB(a,b) = \lambda_a\lambda_b(a,b)$$


$$BA(a,b) = \lambda_b\lambda_a(a,b)$$


Because $\lambda_a$ and $\lambda_b$ are just numbers being multiplied, they commute, and the two values are the same. Thus



$$(AB-BA)(a,b) = (0,0)$$


So the commutator of $A$ and $B$ gives zero when it acts on their simultaneous eigenvector. Many commutators can't give zero when they act on a non-zero vector, though. (This is what it means to have a zero-dimensional null space, mentioned earlier.) For example, our switch and subtract operators had a commutator that simply multiplied the first number by minus one. Such a commutator can't give zero when it acts on anything that isn't zero already, so our example $A$ and $B$ can't have a simultaneous eigenvector, so they can't be certain at the same time, so there is an "uncertainty principle" for them.


If the commutator had been the zero operator, which turns everything into zero, then there's no problem. $(a,b)$ can be whatever it wants and still satisfy the above equation. If the commutator had been something that turns some vectors into the zero vector, those vectors would be candidates for zero-uncertainty states, but I can't think of any examples of this situation in real physics.


In quantum mechanics, the most famous example of the uncertainty principle is for the position and momentum operators. Their commutator is the identity - the operator that does nothing to states. (Actually it's the identity times $i \hbar$.) This clearly can't turn anything into zero, so position and momentum cannot both be certain at the same time. However, since their commutator multiplies by $\hbar$, a very small number compared to everyday things, the commutator can be considered to be almost zero for large, energetic objects. Therefore position and momentum can both be very nearly certain for everyday things.


On the other hand, the angular momentum and energy operators commute, so it is possible for both of these to be certain.


The most mathematically accessible non-commuting operators are the spin operators, represented by the Pauli spin matrices. These deal with vectors with only two entries. They are slightly more complicated than the $A$ and $B$ operators I described, but they do not require a complete course in the mathematics of quantum mechanics to explore.


In fact, the uncertainty principle says more than I've written here - I left parts out for simplicity. The uncertainty of a state can be quantified via the standard deviation of the probability distribution for various eigenvalues. The full uncertainty principle is usually stated


$$\Delta A \Delta B \geq \frac{1}{2}\mid \langle[A,B]\rangle \mid$$


where $\Delta A$ is the uncertainty in the result of a measurement in the observable associated with the operator $A$ and the brackets indicate finding an expectation value. If you would like some details on this, I wrote some notes a while ago that you can access here.


heat - Is fire plasma?


Is Fire a Plasma?



If not, what is it then?


If yes, why don't we teach kids this basic example?




UPDATE: I probably meant a regular commonplace fire of the usual temperature. That should simplify the answer.



Answer



Broadly speaking, fire is a fast exothermic oxidation reaction. The flame is composed of hot, glowing gases, much like a metal that is heated sufficiently that it begins to glow. The atoms in the flame are a vapor, which is why it has the characteristic wispy quality we associate with fire, as opposed to the more rigid structure we associate with hot metal.


Now, to be fair, it is possible for a fire to burn sufficiently hot that it can ionize atoms. However, when we talk about common examples of fire, such as a candle flame, a campfire, or something of that kind, we are not dealing with anything sufficiently energetic to ionize atoms. So, when it comes to using something as an example of a plasma for kids, I'm afraid fire wouldn't be an accurate choice.


How could we define speed without time?



This is a bit of a brain twister but it's a very serious question. Please be precise in your answer.


I do not believe time exists. I believe it is simply an illusion of our perception, the perception of seeing things around us change. The faster things change, the more time seems to go by. In some sense, change is the result of the dissipation of energy. Thus what I believe is that what we are really observing is energy being dissipated and created all around us, and we call it time.


Since time doesn't exist (for me), that would mean that the equation [ speed = distance / time ] is impossible. But nevertheless the concept of speed is fundamental to physics. What could we replace this equation with so that speed no longer depends on time but on energy?



Answer



We know that to position objects in spacetime requires four coordinates e.g. $(t, x, y, z)$. So time certainly exists. The point you're addressing is about the flow of time. Incidentally this point is discussed in some detail in the question Is there a proof of existence of time?.



Any object traces out a worldline that is a curve in spacetime, and we can parameterise this curve by using an affine parameter $\tau$ that varies along the curve, then write the coordinates as a function of this parameter, $t(\tau)$, $x(\tau)$, $y(\tau)$ and $z(\tau)$. In fact this is exactly what is done in General Relativity. In GR the affine parameter is normally the proper time, but any parameter can be used and need not have any physical significance. For example, photon world lines can be parameterised in this way even though the proper time is everywhere constant along a photon's worldline.


Having done this we can now calculate values for $dx/dt$, $dy/dt$, etc and even things like $dx/dy$ if we wish. Then we can define a coordinate velocity as:


$$ v^2 = \left( \frac{dx}{dt} \right)^2 + \left( \frac{dy}{dt} \right)^2 + \left( \frac{dz}{dt} \right)^2$$


So the point is that we can calculate a velocity without worrying about whether there is a flow of time in the sense humans normally use the term. There is no need to replace velocity with any other quantity.
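As a small concrete example of this bookkeeping (my own sketch, not from the answer): take the worldline of a uniformly accelerated observer with $c = 1$ and unit proper acceleration, parameterised by proper time $\tau$, and compute the coordinate velocity directly from the parameterisation.

```python
import sympy as sp

tau = sp.symbols('tau', real=True)

# Worldline parameterised by proper time tau (uniform proper acceleration, c = 1)
t = sp.sinh(tau)
x = sp.cosh(tau)

# Coordinate velocity dx/dt = (dx/dtau) / (dt/dtau); no "flow of time" is needed
v = sp.simplify(sp.diff(x, tau) / sp.diff(t, tau))
print(v)   # tanh(tau) (or the equivalent sinh(tau)/cosh(tau)), always less than 1, i.e. less than c
```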


estimation - Do you round uncertainties in intermediate calculations?


Let's say I have an experimental uncertainty of ±0.03134087786 and I perform many uncertainty calculations using this value. Should I round the uncertainty to 1 significant figure at this stage or leave it unrounded until my final answer?



Answer




tl;dr- No, rounding numbers introduces quantization error and should be avoided except in cases in which it's known not to cause problems, e.g. in short calculations or quick estimations.





Rounding values introduces quantization error (e.g., round-off error). Controlling for the harmful effects of quantization error is a major topic in some fields, e.g. computational fluid dynamics (CFD), since it can cause problems like numerical instability. However, if you're just doing a quick estimation with a calculator or for a quick lab experiment, quantization error can be acceptable. But, to stress it – it's merely acceptable in some cases; it's never a good thing that we want.


This can be confusing because many intro-level classes teach the method of significant figures, which calls for rounding, as a basic method for tracking uncertainty. And in non-technical fields, there's often a rule that estimated values should be rounded, e.g. a best guess of "103 days" might be stated as "100 days". In both cases, the issue is that a reader might mistake the apparent precision of an estimate to imply a certainty that doesn't exist.


Such problems are purely communication issues; the math itself isn't served by such rounding. For example, if a best guess is truly "103 days", then presumably it'd be best to actually use that number rather than arbitrarily biasing it; sure, we might want to adjust an estimate up-or-down for other reasons, but making an intermediate value look pretty doesn't make any sense.


Getting digits back after rounding


Often, publications use a lot of rounding for largely cosmetic reasons. Sometimes these rounded values reflect an approximate level of precision; in others, they're almost arbitrarily selected to look pretty.


While these cosmetic reasons might make sense in a publication, if you're doing sensitive work based on another author's reported values, it can make sense to email them to request the additional digits or/and a finer qualification of their precision.


For example, if another researcher measures a value as "$1.235237$" and then publishes it as $``1.2"$ because their uncertainty is on-the-order-of $0.1$, then presumably the best guess one can make is that the "real" value is distributed around $1.235237$; using $1.2$ on the basis of it looking pretty doesn't make any sense.



Uncertainties aren't special values


The above explanations apply to not just a base measurement, but also to a measurement's uncertainty. The math doesn't care for a distinction between them.


So for grammatical reasons, it's common to write up an uncertainty like ${\pm}0.03134087786$ as $``{\pm}0.03"$; however, no one should be using ${\pm}0.03$ in any of their calculations unless they're just trying to do a quick estimate or otherwise aren't too concerned with accuracy.


In summary, no, intermediate values shouldn't be rounded. Rounding is best understood as a grammatical convention to make writing look pretty rather than being a mathematical tool.
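To make the effect concrete before the examples below, here is a tiny sketch (my own, with made-up uncertainty values) of propagating relative uncertainties through a product, once at full precision and once after rounding each uncertainty to one significant figure first:

```python
import math

# Relative uncertainties of three measured factors in a product x = a*b*c (made-up values)
rel_unc = [0.03134087786, 0.04721893, 0.02514631]

# Standard propagation for a product: relative uncertainties add in quadrature
full = math.sqrt(sum(u**2 for u in rel_unc))

# Same calculation, but rounding each uncertainty to one significant figure first
rounded = math.sqrt(sum(round(u, 2)**2 for u in rel_unc))

print(full, rounded)   # ~0.062 vs ~0.066: premature rounding shifts the result by several percent
```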


Examples of places in which rounding is problematic


A general phenomenon is loss of significance:



Loss of significance is an undesirable effect in calculations using finite-precision arithmetic such as floating-point arithmetic. It occurs when an operation on two numbers increases relative error substantially more than it increases absolute error, for example in subtracting two nearly equal numbers (known as catastrophic cancellation). The effect is that the number of significant digits in the result is reduced unacceptably. Ways to avoid this effect are studied in numerical analysis.


"Loss of significance", Wikipedia




The obvious workaround is then to increase precision when possible:



Workarounds


It is possible to do computations using an exact fractional representation of rational numbers and keep all significant digits, but this is often prohibitively slower than floating-point arithmetic. Furthermore, it usually only postpones the problem: What if the data are accurate to only ten digits? The same effect will occur.


One of the most important parts of numerical analysis is to avoid or minimize loss of significance in calculations. If the underlying problem is well-posed, there should be a stable algorithm for solving it.


"Loss of significance", Wikipedia



A specific example is in Gaussian elimination, which has a lot of precision-based problems:



One possible problem is numerical instability, caused by the possibility of dividing by very small numbers. If, for example, the leading coefficient of one of the rows is very close to zero, then to row reduce the matrix one would need to divide by that number so the leading coefficient is 1. This means any error that existed for the number which was close to zero would be amplified. Gaussian elimination is numerically stable for diagonally dominant or positive-definite matrices. For general matrices, Gaussian elimination is usually considered to be stable, when using partial pivoting, even though there are examples of stable matrices for which it is unstable.



"Gaussian elimination", Wikipedia [references omitted]



Besides simply increasing all of the values' precision, another workaround technique is pivoting:



Partial and complete pivoting


In partial pivoting, the algorithm selects the entry with largest absolute value from the column of the matrix that is currently being considered as the pivot element. Partial pivoting is generally sufficient to adequately reduce round-off error. However, for certain systems and algorithms, complete pivoting (or maximal pivoting) may be required for acceptable accuracy.


"Pivot element", Wikipedia



newtonian mechanics - Is the second law of thermodynamics a fundamental law, or does it emerge from other laws?


My question is basically this. Is the second law of thermodynamics a fundamental, basic law of physics, or does it emerge from more fundamental laws?


Let's say I was to write a massive computer simulation of our universe. I model every single sub-atomic particle with all their known behaviours, the fundamental forces of nature as well as (for the sake of this argument) Newtonian mechanics. Now I press the "run" button on my program - will the second law of thermodynamics become "apparent" in this simulation, or would I need to code in special rules for it to work? If I replace Newton's laws with quantum physics, does the answer change in any way?


FWIW, I'm basically a physics pleb. I've never done a course on thermodynamics, and reading about it on the internet confuses me somewhat. So please be gentle and don't assume too much knowledge from my side. :)



Answer



In thermodynamics, the early 19th century science about heat as a "macroscopic entity", the second law of thermodynamics was an axiom, a principle that couldn't be derived from anything deeper. Instead, physicists used it as a basic assumption to derive many other things about the thermal phenomena. The axiom was assumed to hold exactly.


In the late 19th century, people realized that thermal phenomena are due to the motion of atoms and the amount of chaos in that motion. Laws of thermodynamics could suddenly be derived from microscopic considerations. The second law of thermodynamics then holds "almost at all times", statistically – it doesn't hold strictly because the entropy may temporarily drop by a small amount. It's unlikely for entropy to drop by too much; the process' likelihood goes like $\exp(\Delta S/k_B)$, $\Delta S \lt 0$. So for macroscopic decreases of the entropy, you may prove that they're "virtually impossible".



The mathematical proof of the second law of thermodynamics within the axiomatic system of statistical physics is known as the Boltzmann's H-theorem or its variations of various kinds.


Yes, if you simulate (let us assume you are talking about classical, deterministic physics) many atoms and their positions, you will see that they evolve into increasingly disordered states, so that the entropy is increasing at almost all times (unless you extremely finely adjust the initial state – unless you maliciously calculate the very special initial state for which the entropy will happen to decrease, but these states are extremely rare and they don't differ from the majority in any other way than by the fact that they happen to evolve into lower-entropy states).
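If you want to see this "emergence" in a toy version of the simulation you describe, here is a minimal sketch (my own; it is the classic Ehrenfest urn model rather than a full molecular-dynamics run): particles hop at random between the two halves of a box, and the Boltzmann entropy of the macrostate climbs to its maximum and then just fluctuates slightly around it, without any entropy rule being coded in.

```python
import math
import random

N = 200          # particles in a box divided into two halves
left = N         # start with every particle in the left half (a very low-entropy state)
random.seed(0)

def entropy(n_left):
    # Boltzmann entropy S/k_B = ln(number of microstates with n_left particles on the left)
    return math.lgamma(N + 1) - math.lgamma(n_left + 1) - math.lgamma(N - n_left + 1)

for step in range(5001):
    if step % 1000 == 0:
        print(step, left, round(entropy(left), 2))
    # pick a particle at random and let it hop to the other half
    if random.random() < left / N:
        left -= 1
    else:
        left += 1
```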


What is the relationship between string theory and quantum field theory?


Please forgive a string theory novice asking a basic question.


Over at this question Luboš Motl gave an excellent answer, but he made a side comment that I've heard before and really would want to know more about:



Quantum field theory is the class of minimal theories that obey both sets of principles [Ed: SR and QM]; string theory is a bit more general one (and the only other known, aside from QFT, that does solve the constraints consistently).




What are the arguments that string theory is more general than QFT? I get that you can derive many different QFTs as low energy effective theories on different string backgrounds. But from my limited exposure to worldsheet perturbation theory and string field theory I can also see string theory as a very special kind of QFT. I also realize these approaches don't fully characterise string theory, but I don't know enough about their limitations to understand why the full definition of string theory (M theory?) surpasses QFT.


My naive guess would be that no, string theory can't be more general than QFT because surely there are many more QFTs which are asymptotically free/safe than could possibly come from string theory. Think ridiculously large gauge groups, $SU(10)^{800}$, etc. Since these theories don't need a UV completion into something else, string theory can't be a more general framework than QFT. Logically you could also have theories which UV-complete into something other than string theory (even if no such completion is presently known).


Alternately you could turn this around and say that string theory limits the kind of QFTs you can get in the low energy limit. This is saying that string theory is more predictive than QFT, i.e. less general. I always thought this was the goal the whole time! If it is the other way around and string theory really is more general than QFT, doesn't this mean that string theory is less predictive than, for instance, old school GUT model building?


So is the relationship between string theory and quantum field theory a strict inclusion $\mathrm{QFT} \subset \mathrm{ST}$ or more like a duality/equivalence $\mathrm{QFT} \simeq \mathrm{ST}$, or a more complicated Venn diagram?


Note that I am not asking about AdS/CFT as this only deals with special string backgrounds and their QFT duals. I'm asking about the general relationship between string theory and QFT.



Answer



Notice:





  1. Pertubative string theory is defined to be the asymptotic perturbation series which are obtained by summing correlators/n-point functions of a 2d superconformal field theory of central charge -15 over all genera and moduli of (punctured) Riemann surfaces.




  2. Perturbative quantum field theory is defined to be the asymptotic perturbation series which are obtained by applying the Feynman rules to a local Lagrangian -- which equivalently, by worldline formalism, means: obtained by summing the correlators/n-point functions of 1d field theories (of particles) over all loop orders of Feynman graphs.




So the two are different. But for any perturbation series one can ask if there is a local non-renormalizable Lagrangian such that its Feynman rules reproduce the given perturbation series at sufficiently low energy. If so, one says this Lagrangian is the effective field theory of the theory defined by the original perturbation series (which, if renormalized, is conversely then a "UV-completion" of the given effective field theory).


Now one can ask which effective quantum field theories arise this way as approximations to string perturbation series. It turns out that only rather special ones do. For instance those that arise all look like anomaly-free Einstein-Yang-Mills-Dirac theory (consistent quantum gravity plus gauge fields plus minimally-coupled fermions). Not like $\phi^4$, not like the Ising model, etc.


(Sometimes these days it is forgotten that QFT is much more general than the gauge theory plus gravity plus fermions that is seen in what is just the standard model. QFT alone has no reason to single out gauge theories coupled to gravity and spinors in the vast space of all possible anomaly-free local Lagrangians.)


On the other hand now, within the restricted area of Einstein-Yang-Mills-Dirac type theories, it currently seems that by choosing suitable worldsheet CFTs one can obtain a large portion of the possible flavors of these theories in the low energy effective approximation. Lots of kinds of gauge groups, lots of kinds of particle content, lots of kinds of couplings. There are still constraints as to which such QFTs are effective QFTs of a string perturbation series, but they are not well understood. (Sometimes people forget what it takes to define a full 2d CFT. It's more than just conformal invariance and modular invariance, and even that is often just checked in low order in those "landscape" surveys.) In any case, one can come up with heuristic arguments that exclude some Einstein-Yang-Mills-Dirac theories as possible candidates for low energy effective quantum field theories approximating a string perturbation series. The space of them has been given a name (before really being understood, in good tradition...) and that name is, for better or worse, the "Swampland".



For this text with more cross-links, see here:


http://ncatlab.org/nlab/show/string+theory+FAQ#RelationshipBetweenQuantumFieldTheoryAndStringTheory


quantum mechanics - Variational proof of the Hellmann-Feynman theorem


I use the following notation and definition for the (first) variation of some functional $E[\psi]$: \begin{equation} \delta E[\psi][\delta\psi] := \lim_{\varepsilon \rightarrow 0} \frac{E[\psi + \varepsilon \delta\psi] - E[\psi]}{\varepsilon}. \end{equation} For a Hamiltonian that depends on some parameter, $H(\lambda)$, which fulfills the eigenequation $H(\lambda) |\psi(\lambda)\rangle = E(\lambda) |\psi(\lambda)\rangle$, one can quite easily show that \begin{equation} \frac{dE(\lambda)}{d\lambda} = \langle \psi(\lambda) | \frac{dH(\lambda)}{d\lambda} | \psi(\lambda) \rangle \end{equation} which is called the Hellmann-Feynman theorem. Now in quantum chemistry, one often seeks an approximation of the true eigenfunctions by making the functional for the energy expectation value stationary, \begin{equation} E[\psi] = \frac{\langle \psi | H | \psi \rangle}{\langle \psi | \psi \rangle} \end{equation} \begin{equation} \delta E[\psi][\delta \psi] = \frac{1}{\langle \psi | \psi \rangle}\left[ \langle \delta\psi | H - E[\psi] | \psi\rangle + \langle \psi | H - E[\psi] | \delta\psi\rangle \right] = 0 \end{equation} for all variations $\delta\psi$ in some subspace (called the variational space) of the full Hilbert space. In this case, one often reads that the Hellmann-Feynman theorem is still valid for the approximate wave functions and energies.


Unfortunately, I have problems finding a clear proof of this statement in books. Wikipedia tries a proof (section "Alternate proof"), but it assumes that the functional derivative \begin{equation} \frac{\delta E[\psi]}{\delta\psi} \end{equation} exists, which is not the case because of the antilinearity in bra vectors, which leads to \begin{equation} \delta E[\psi][\alpha \cdot \delta \psi] \neq \alpha \cdot \delta E[\psi][ \delta \psi] \end{equation} for complex $\alpha$. Can anyone provide me with a clear justification of why the Hellmann-Feynman theorem works for variational wavefunctions?



Answer



Comment to the question (v2): One should not require the functional derivative $\frac{\delta E[\psi]}{\delta\psi}$ to be complex differentiable/holomorphic. That's an impossible requirement for a real functional $$E[\psi]~:=~ \frac{\langle \psi | H | \psi \rangle}{\langle \psi | \psi \rangle}~\in~ \mathbb{R}$$ (apart from trivial cases), and not necessary. If we rewrite the wave function $\psi$ as a real column vector $$\begin{pmatrix} {\rm Re}\psi \cr {\rm Im}\psi\end{pmatrix}~\in~ \mathbb{R}^2,$$ then the functional derivative $\frac{\delta E[\psi]}{\delta\psi}$ should be interpreted as the corresponding row vector $$\left(\frac{\delta E[\psi]}{\delta{\rm Re}\psi}, \frac{\delta E[\psi]}{\delta{\rm Im}\psi} \right)~\in~ \mathbb{R}^2.$$
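For what it's worth, here is a sketch (my own, not part of the answer above) of the usual argument for why Hellmann-Feynman survives for variationally optimized wavefunctions. Assume the variational space itself does not depend on $\lambda$ and that $\partial\psi/\partial\lambda$ is an admissible variation. Then, writing $E(\lambda) = E[\psi(\lambda);\lambda]$ and splitting the total derivative into the explicit $\lambda$-dependence of $H$ and the implicit dependence through $\psi(\lambda)$,

$$\frac{dE(\lambda)}{d\lambda} = \frac{\langle \psi(\lambda) | \partial_\lambda H(\lambda) | \psi(\lambda) \rangle}{\langle \psi(\lambda) | \psi(\lambda) \rangle} + \underbrace{\delta E[\psi(\lambda)]\!\left[\frac{\partial\psi(\lambda)}{\partial\lambda}\right]}_{=\,0 \text{ by stationarity}},$$

where the second term is exactly the (real-linear) first variation defined in the question, evaluated in the direction $\partial\psi/\partial\lambda$, and it vanishes because $\psi(\lambda)$ makes $E$ stationary within the variational space. No complex-differentiable functional derivative is ever needed.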


general relativity - evidence on the equation of state for dark energy?


If dark energy contributes mass-energy density $\rho$ and pressure $p$ to the stress-energy tensor, then you can define $w=p/\rho$, where $w=-1$ gives a cosmological constant, $w<-1$ gives a big rip, and $w<-1/3$ if we want to use dark energy to explain cosmological acceleration. The WP "Big Rip" article cites a paper that dates back to 2003 http://arxiv.org/abs/astro-ph/0302506 , which states that the empirical evidence was at that time only good enough to give $-2 \lesssim w \lesssim -.5$.


Have observations constrained $w$ any more tightly since 2003?



I've heard many people express the opinion that $w<-1$ is silly or poorly motivated, or that "no one believes it." What are the reasons for this? Other than mathematical simplicity, I don't know of any reason to prefer $w=-1$ over $w\ne -1$. Considering that attempts to calculate $\Lambda$ using QFT are off by 120 orders of magnitude, is it reasonable to depend on theory to give any input into what values of $w$ are believable?




quantum field theory - Regularization of the Casimir effect


For starters, let me say that although the Casimir effect is standard textbook stuff, the only QFT textbook I have in reach is Weinberg and he doesn't discuss it. So the only source I currently have on the subject is Wikipedia. Nevertheless I suspect this question is appropriate since I don't remember it being addressed in other textbooks


Naively, computation of the Casimir pressure leads to infinite sums and therefore requires regularization. Several regulators can be used that yield the same answer: zeta-function, heat kernel, Gaussian, and probably others too. The question is:



What is the mathematical reason all regulators yield the same answer?



In physical terms it means the effect is insensitive to the detailed physics of the UV cutoff, which in realistic situations is related to the properties of the conductors used. The Wikipedia article mentions that for some more complicated geometries the effect is sensitive to the cutoff, so why isn't it for the classic parallel-plane example?


EDIT: Aaron provided a wonderful Terry Tao ref relevant to this issue. From this text it is clear that the divergent sum for vacuum energy can be decomposed into a finite and an infinite part, and that the finite part doesn't depend on the choice of regulator. However, the infinite part does depend on the choice of regulator (see eq 15 in Tao's text). Now, we have another parameter in the problem: the separation between the conductor planes L. What we need to show is that the infinite part doesn't depend on L. This still seems like a miracle since it should happen for all regulators. Moreover, unless I'm confused it doesn't work for the toy example of a massless scalar in 2D. For this example, all terms in the vacuum energy sum are proportional to 1/L hence the infinite part of the sum asymptotics is also proportional to 1/L. So we have a "miracle" that happens only for specific geometries and dimensions.
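As a small numerical illustration of the regulator-independence of the finite part (my own sketch; it is the 1D sum $1+2+3+\dots$ that appears in the toy example, not the full 3D Casimir calculation): regulate $S(\epsilon)=\sum_n n\,f(n\epsilon)$ with two different smooth cutoffs, eliminate the $1/\epsilon^2$ divergence by combining two values of $\epsilon$, and in both cases the leftover constant comes out close to $-1/12$.

```python
import numpy as np

n = np.arange(1, 400001, dtype=float)

def S(eps, cutoff):
    # regulated version of the divergent sum 1 + 2 + 3 + ...
    return np.sum(n * cutoff(n * eps))

def finite_part(eps, cutoff):
    # S(eps) = A/eps^2 + C + O(eps^2); the combination below cancels A/eps^2 and leaves C
    return (4.0 * S(eps, cutoff) - S(eps / 2.0, cutoff)) / 3.0

for name, f in [("exponential", lambda x: np.exp(-x)),
                ("Gaussian",    lambda x: np.exp(-x**2))]:
    print(name, finite_part(1e-2, f))   # both print roughly -0.0833 = -1/12
```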




general relativity - When will the Hubble volume coincide with the volume of the observable Universe?


The Hubble volume is the volume that corresponds to objects so far from the Earth that the space between us and them is expanding faster than the speed of light. (I.e. objects outside this volume could never again be visible to us, even in principle.)


The volume of the observable Universe extends from us to the maximum distance light could have travelled since the Universe became transparent, when it was roughly 380,000 years old.


Since c1998 we have known the expansion of the Universe is accelerating, implying that the number of galaxies within the Hubble volume is decreasing. Since the Big Bang (well infinitesimally close to it at least) we know that time has been going forwards, and thus that the observable Universe is expanding.


When do these two volumes coincide with one another and what will the corresponding maximum volume of the observable Universe be at that time? The associated calculation or link to a suitable reference would also be very much appreciated.



Supplementary:


I have often wondered about this, ever since reading Professor Sir Roger Penrose's Cycles of Time a year or two ago. I thought about it again today after reading a somewhat unassociated article about how recent results had shown the accelerating expansion of the Universe cannot be explained by the "Hubble Bubble" hypothesis. Before asking this question I did of course search the site for an existing answer: this question is extremely similar but does not appear to include an exact answer (in fact the answers appear to somewhat contradict each other). Moreover, it does not appear to address the subtleties involved in making the estimate, such as the variation in inter-galactic recession speeds, due to the change in gravitational force between them, since the Big Bang.




classical mechanics - How much effect does the mass of a bicycle tire have on acceleration?


There are claims often made that, e.g., "an ounce of weight at the rims is like adding 7 ounces of frame weight." This is "common knowledge", but a few of us are skeptical, and our crude attempts at working out the math seem to support that skepticism.


So, let's assume a standard 700C bicycle tire, which has an outside radius of about 36 cm, a bike and rider weighing 90 kg in total, and a tire+tube+rim weighing 950 g. With the simplifying assumption that all the wheel mass is at the outer diameter, how much effect does adding a gram of additional weight at the outer diameter have on acceleration, vs adding a gram of weight to the frame+rider?



Answer



A few simplifying assumptions:




  • I'm going to ignore any rotational energy stored in the bike chain, which should be pretty small, and wouldn't change when you change bike tires

  • I'm going to use 50 cm for the radius of the bike tire. This is probably a little big, and your bike will likely have a different radius, but it makes my calculations easier, so there. I will include a formula nonetheless.

  • I'm going to assume that the rider provides a fixed torque to the wheels. This isn't strictly true, especially when the bike has different gears, but it simplifies our calculations, and, once again, the torque provided won't vary when you change the weight profile of the tire


OK, so now, let's analyze our idealized bicycle. We're going to have the entire mass $m$ of each of the two wheels concentrated at the radius $R$ of the tires. The cyclist and bicycle will have a mass $M$. The cycle moves forward when the cyclist provides a torque $\tau$ to the wheel, which rolls without slipping over the ground, with the no-slip conditions $v=R\omega$ and $a=\alpha R$ requiring a forward frictional force $F_{fr}$ on the bike.


Rotationally, with the tire, we have:


$$\begin{align*} I\alpha &= \tau - F_{fr}R\\ mR^{2} \left(\frac{a}{R}\right)&=\tau-F_{fr}R\\ a&=\frac{\tau}{mR} - \frac{F_{fr}}{m} \end{align*}$$


Which would be great for predicting the acceleration of the bike, if we knew the magnitude of $F_{fr}$, which we don't.


But, we can also look at Newton's second law on the bike, which doesn't care about the torque at all. There, we have (the factor of two comes from having two tires):



$$\begin{align*} (M+2m)a&=2F_{fr}\\ F_{fr}&=\frac{1}{2}(M+2m)a \end{align*}$$


Substituting this into our first equation, we get:


$$\begin{align*} a&=\frac{\tau}{mR}-\frac{1}{m}\frac{(M+2m)a}{2}\\ a\left(1+\frac{M}{2m} +1\right)&=\frac{\tau}{mR}\\ a\left(\frac{4m+M}{2m}\right)&=\frac{\tau}{mR}\\ a&=\frac{2\tau}{R(4m+M)} \end{align*}$$


So, now, let's assume a 75 kg cyclist/cycle combo, a 1 kg wheel, and a 0.5 m radius for our wheel. This gives $a=0.0506 \tau$. Increasing the mass of the cyclist by 1 kg results in the acceleration decreasing to $a=0.0500 \tau$. Increasing the mass of the wheels by 0.5 kg each results in the acceleration decreasing to $a=0.0494\tau$, a drop roughly double the effect of adding that mass to the rider/frame.


This result (an ounce of weight at the rims is like adding two ounces of frame weight) is true regardless of the mass of the cyclist/cycle, the wheel radius, or the rider's torque. To see this, note that $$\begin{align*} \frac{da}{dm} &=\frac{-8\tau}{R(4m+M)^2}\\ \frac{da}{dM} &=\frac{-2\tau}{R(4m+M)^2} \end{align*}$$ Adding a small amount of mass $\delta m$ to the frame changes the acceleration by $\delta m \frac{da}{dM}$, while adding half this amount to each of the wheels changes the acceleration by $\frac{1}{2} \delta m \frac{da}{dm}$. The ratio of acceleration changes is $$\frac{\frac{1}{2} \frac{da}{dm}}{\frac{da}{dM}} = 2$$ regardless of the other parameter values. It's not hard to see that this result holds for unicycles and trikes as well (i.e. it doesn't depend on the number of wheels on the cycle).
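

As a quick sanity check of the numbers quoted above (an added sketch, not part of the original answer), the final formula $a = 2\tau/(R(4m+M))$ can be evaluated directly; the parameter values are the illustrative ones already assumed in the answer.

    def accel(tau, R, m, M):
        """Acceleration a = 2*tau / (R*(4*m + M)) of the idealized bike."""
        return 2.0 * tau / (R * (4.0 * m + M))

    tau, R = 1.0, 0.5                           # unit torque, 0.5 m wheel radius
    base      = accel(tau, R, m=1.0, M=75.0)    # baseline: 75 kg rider+frame, 1 kg wheels
    frame_1kg = accel(tau, R, m=1.0, M=76.0)    # add 1 kg to the frame
    wheel_1kg = accel(tau, R, m=1.5, M=75.0)    # add 0.5 kg to each wheel

    print(f"baseline:         a = {base:.4f} * tau")
    print(f"+1 kg on frame:   a = {frame_1kg:.4f} * tau  (drop {base - frame_1kg:.4f})")
    print(f"+1 kg on wheels:  a = {wheel_1kg:.4f} * tau  (drop {base - wheel_1kg:.4f})")

    # The drop from putting the extra kilogram at the rims is roughly twice the
    # drop from putting it on the frame, matching the factor-of-2 result above.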


Wednesday, 28 October 2020

spacetime - A Sphere of Black Holes


Imagine a sphere of black holes surrounding a region of space. Would this region be separated from the rest of normal spacetime (at least for some time, until these black holes finally fall into each other)?


So, seen from the outside, we have a black hole, but with a non-singular interior.



Is this possible?



Answer



The radius of the event horizon of a black hole of mass $M$ is given by:


$$ r_s = \frac{2GM}{c^2} \tag{1} $$


Let's consider your idea of taking $n$ black holes of mass $M$ and arranging them into a sphere. The total mass is $nM$, and the radius of the event horizon corresponding to this mass is:


$$ R_s = n\frac{2GM}{c^2} \tag{2} $$


Now let's see how closely we have to pack our black holes to get them to form a spherical surface with their event horizons overlapping. The cross-sectional area of a single black hole is $\pi r_s^2$, and since we have $n$ of them their total cross-sectional area is just $n \pi r_s^2$. The surface area of a sphere of radius $R$ is $4\pi R^2$, and we can get a rough idea of $R$ by just setting the areas equal:


$$ 4\pi R^2 = n \pi r_s^2 $$


Giving us:


$$ R = \frac{\sqrt{n}}{2}r_s $$



Use equation (1) to substitute for $r_s$ and we find that the radius of our sphere of packed black holes is:


$$ R = \frac{\sqrt{n}}{2}\frac{2GM}{c^2} \tag{3} $$


But if you compare equations (2) and (3) you find that $R < R_s$, because $\sqrt{n}/2 < n$ for every $n \ge 1$. That means that when you try to construct the sphere of black holes that you imagine, you won't be able to do it. An event horizon will form before you can get the individual black holes to overlap. You won't be able to construct the black shell that you want, and it's impossible to trap a normal bit of space inside a shell of black holes.
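

A small numerical illustration of the comparison between equations (2) and (3) (an added sketch, not part of the original answer; the values of $n$ and $M$ are arbitrary choices for illustration):

    G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
    c = 2.998e8          # speed of light, m/s
    M_sun = 1.989e30     # solar mass, kg

    M = 10 * M_sun       # mass of each black hole (arbitrary illustrative value)
    r_s = 2 * G * M / c**2                  # horizon radius of one hole, eq. (1)

    for n in (10, 100, 1000):
        R_s = n * r_s                       # horizon radius of the combined mass, eq. (2)
        R_pack = (n**0.5 / 2) * r_s         # rough radius of the packed shell, eq. (3)
        print(f"n = {n:5d}:  R_pack = {R_pack:10.3e} m   R_s = {R_s:10.3e} m   "
              f"R_pack/R_s = {R_pack / R_s:.3f}")

    # For any n > 1/4 the packed shell sits inside the horizon of the combined
    # mass (R_pack < R_s), which is the point made above.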


However, there is a situation a bit like the one you're thinking about, and it's called the Reissner–Nordström metric. A normal black hole has just the single event horizon, but if you electrically charge the black hole you get a geometry with two event horizons, an inner one and an outer one. When you cross the outer horizon you enter a region of spacetime where time and space are switched round, just as in an uncharged black hole, and you can't resist falling inwards towards the second horizon. However, when you cross the second horizon you're back in normal space. You can choose a trajectory that misses the singularity and travels outwards through both horizons, so you re-emerge from the black hole. If you're interested, I go into this further in my answer to Entering a black hole, jumping into another universe---with questions.


As for what the spacetime inside the second horizon looks like, well it's just spacetime. It's highly curved spacetime, but there's nothing extraordinary about it.


electrostatics - Confusion in understanding the proof of Uniqueness Theorem


I am having problems comprehending the proof by contradiction used by Purcell in his book:



...We can now assert that $W^1$ must be zero at all points in space. For if it is not, it must have a maximum or minimum somewhere; remember that $W$ is zero at infinity as well as on all the conducting boundaries. If $W$ has an extremum at some point $P$, consider a sphere centered on that point. As we saw in chapter 2, the average over a sphere of a function that satisfies Laplace's equation is equal to its value at the center. This could not be true if the center is an extremum; it must therefore be zero everywhere. $^1$ $W = \phi(x,y,z) - \psi(x,y,z)$, where the former term is the deduced solution and the latter term is the assumed second solution, introduced in order to derive a contradiction.



Extremum means local maximum or local minimum, right? Why can't the average be equal to an extremum value? And if the extremum value cannot equal the average, how does that ensure that $W$ is the same everywhere?



Answer



Well, if you have an extremum (say a local maximum) of $W$ at $p$, then you have a small open ball $N$ centered at $p$ such that $W(x) < W(p)$ for some $x\ne p$ in $N$. Therefore $$ \frac{1}{{\rm vol}(N)}\int_N W(x)\,dx < W(p),\tag{1}$$ assuming we have taken the ball $N$ small enough for $W(x) \le W(p)$ to hold throughout $N$. But for a function satisfying Laplace's equation the average over $N$ equals the value at the centre, $W(p)$, which contradicts (1). So $W$ can have no interior extremum, and since it vanishes on the conducting boundaries and at infinity, it must vanish everywhere.
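

As a quick illustration of the mean-value property the argument relies on (an added sketch, not part of the original answer), one can check numerically that the average of a harmonic function over a small sphere matches its value at the centre, while a non-harmonic function fails this; the particular function, centre, and radius below are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(0)

    def sphere_average(f, center, radius, n=500000):
        """Monte-Carlo average of f over the sphere of given centre and radius."""
        v = rng.normal(size=(n, 3))
        v /= np.linalg.norm(v, axis=1, keepdims=True)   # uniform directions
        return f(center + radius * v).mean()

    harmonic     = lambda p: p[:, 0]**2 - p[:, 1]**2    # Laplacian = 0
    not_harmonic = lambda p: p[:, 0]**2 + p[:, 1]**2    # Laplacian = 4

    center = np.array([0.3, -0.2, 0.5])
    for name, f in [("harmonic x^2 - y^2", harmonic), ("non-harmonic x^2 + y^2", not_harmonic)]:
        avg = sphere_average(f, center, radius=0.1)
        val = f(center[None, :])[0]
        print(f"{name:24s} centre value = {val:+.4f}   sphere average = {avg:+.4f}")

    # The harmonic function's sphere average matches its centre value (up to
    # Monte-Carlo noise), so it cannot have a strict extremum there; the
    # non-harmonic one does not match.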

spacetime - What is "a general covariant formulation of newtonian mechanics"?


I am a little confused: I read that there are general covariant formulations of Newtonian mechanics (e.g. here). I always thought:


1) A theory is covariant with respect to a group of transformations if the form of its equations is preserved under those transformations.


2) General covariance means that not only transformations between systems moving at arbitrary relative velocities, but also transformations involving arbitrary accelerations, preserve the form of the equations.


But in that case the principle of general relativity (that the form of all physical laws must be preserved under arbitrary coordinate transformations) would not be unique to general relativity. Where is my error in reasoning, or, stated differently, which term do I misunderstand?


Regards and thanks in advance!




Answer



The development of general relativity has led to a lot of misconceptions about the significance of general covariance. It turns out that general covariance is a manifestation of a choice to represent a theory in terms of an underlying differentiable manifold.


Basically, if you define a theory in terms of the geometric structures native to a differentiable manifold (i.e. tangent spaces, tensor fields, connections, Lie derivatives, and all that jazz), the resulting theory will automatically be generally-covariant when expressed in coordinates (guaranteed by the manifold's atlas).


It turns out that most physical theories can be expressed in this language (e.g. symplectic manifolds in the case of Hamiltonian Mechanics) and can therefore be presented in a generally covariant form.


What turns out to be special(?) about the general theory of relativity is that space and time combine to form a (particular type of) Lorentzian manifold and that the metric tensor field on the manifold is correlated with the stuff occupying the manifold.


In other words, general covariance was not the central message of general relativity; it just seemed like it was because it was a novelty at the time, and a poorly understood one at that.


Tuesday, 27 October 2020

space - What is the possibility of a railgun assisted orbital launch?


Basic facts: The world's deepest mine is 2.4 miles deep. Railguns can achieve projectile muzzle velocities on the order of 7.5 km/s. The Earth's escape velocity is 11.2 km/s.


It seems to me that a railgun style launch device built into a deep shaft such as an abandoned mine could reasonably launch a vehicle into space. I have not run the calculations and I wouldn't doubt that there might be issues with high G's that limit the potential for astronauts on such a vehicle, but even still it seems like it would be cheaper to build such a launch device and place a powerplant nearby to run it than it is to build and fuel single-use rockets.


So, what is the possibility of a railgun assisted orbital launch?


What am I missing here? Why hasn't this concept received more attention?



Answer



Ok David asked me to bring the rain. Here we go.


Indeed it is very feasible and very efficient to use an electromagnetic accelerator to launch something into orbit, but first a look at our alternative:





  • Space Elevator: we don't have the tech




  • Rockets: You spend most of the energy carrying the fuel, and the machinery is complicated, dangerous, and cannot be reused (no orbital launch vehicle has been 100% reusable; SpaceShipOne is suborbital, more on the distinction in a moment). Look at the SLS that NASA is developing: the specs aren't much better than those of the Saturn V, and that was 50 years ago. The reason is that rocket fuel is the exact same; there is only so much energy you can squeeze out of these reactions. If there is a breakthrough in rocket fuel that is one thing, but as there has been none and none is on the horizon, rockets as orbital launch vehicles are a dead-end technology which we have already pushed to its pinnacle.




  • Cannons: Acceleration by a pressure wave is limited to the speed of sound in the medium, so you cannot use just any explosive, as you will be limited by this (gunpowder gives around $2\text{ km/s}$, which is why battleship cannons have not increased in range over the last 100 years). Using hydrogen as the working gas you can achieve velocities of up to about $11\text{ km/s}$. This is the regime of 'light gas guns', and a company wants to use this to launch things into orbit. It requires huge accelerations (something ridiculous, many thousands of $\mathrm{m/s^2}$), which restricts you to very hardened electronics and bulk material supply such as fuel and water.




  • Maglev: Another company is planning on this (http://www.startram.com/), but if you look at their proposal it requires superconducting loops carrying something like 200 MA, generating a magnetic field that would destroy all communications in several states. I find this unlikely to be constructed.





  • Electromagnetic accelerator (railgun): This is going to be awesome! There is no requirement for high accelerations (a railgun can operate at lower accelerations) and no limit on the upper speed. See the following papers:



    Some quick distinctions: there is suborbital and there is orbital launch. Suborbital launches can reach quite large altitudes, well into space; sounding rockets can go up to 400 miles, and space starts at 60 miles. The difference is whether you have enough tangential velocity to achieve orbit. For $1\text{ kg}$ at $200\text{ km}$ above the Earth, the energy to lift it to that height is roughly $m g h \approx 2\text{ MJ}$, but the tangential velocity required to stay in orbit follows from $m v^2 / r = G m M / r^2$, yielding $KE = 0.5\, m v^2 = 0.5\, G m M / r \approx 30\text{ MJ}$, so you need far more kinetic energy tangentially (a short numerical check of these figures follows this list). To do anything useful you need to be orbital, so you don't want to aim your gun straight up; you want it at some gentle angle, going up a mountain or something.


    The papers I cited all have the railgun going up a mountain, about a mile long, and launching water and cargo. That is because to achieve the $6\text{ km/s}+$ needed for orbital velocity you have to accelerate the object from a standstill over the length of your track: the shorter the track, the higher the acceleration. You would need on the order of 100 miles of track to bring the acceleration down to within the survival tolerances NASA uses.


    Why would you want to do this? You just need to maintain the power systems and the rails, which are on the ground, so you can have crews on them the whole time. The entire thing is reusable and can be used many times a day. You can also standardize the size of the object it launches, which opens a massive market of spacecraft producers: small companies that can't pay \$20 million for a launch can now afford \$500,000 for one. The electricity cost of a railgun launch drops to about \$3/kg, which means almost all of the launch price goes to maintenance and capital costs, and once the gun is paid down prices can drop dramatically. It is the only technology humanity currently has that can launch large quantities of material, and in the end it is all about mass launched.


    No one has considered a railgun that is many miles long because it sounds crazy right off the bat, so most proposals are for short, high-acceleration railguns as in the papers above. The issue is that this limits what they can launch, and as soon as you do that no one is very interested. Why is a long railgun crazy? In reality it isn't: the raw materials (aluminum rails, concrete tube, flywheels, and vacuum pumps) are all known and cheap. If they could lay 2000 miles of iron railroad in the 1800s, why can't we do 150 miles of aluminum in the 2000s? The question is one of money and willpower: someone needs to show that this will work, and not just write papers about it but get out there and do it, if we are ever to have a hope of getting off this rock as a species and not just as the 600 or so who have gone already. Also, the large companies and space agencies are not going to risk billions on a new project while there is technology that has been perfected and proven over the last 80 years that they could use. There are a lot of engineering challenges, some of which I and others have been working on in our spare time and have solved, and some of which are still open problems. I and several other scientists who are finishing or have recently finished their PhDs plan on pursuing this course (jeff ross and josh at solcorporation.com; the website isn't up yet because I finished my PhD 5 days ago, but it is coming).
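

As flagged above, here is a rough numerical check (an added sketch, not part of the original answer) of the lift-energy versus orbital-kinetic-energy estimate for 1 kg at 200 km, and of the track-length versus acceleration trade-off; the 6 km/s muzzle speed and the track lengths are the figures mentioned above, and the physical constants are standard values.

    import math

    G = 6.674e-11        # m^3 kg^-1 s^-2
    M_E = 5.972e24       # Earth mass, kg
    R_E = 6.371e6        # Earth radius, m
    g = 9.81             # m/s^2

    m, h = 1.0, 200e3    # 1 kg payload, 200 km altitude
    r = R_E + h

    lift_energy = m * g * h                   # ~2 MJ
    v_orb = math.sqrt(G * M_E / r)            # circular orbital speed, ~7.8 km/s
    orbital_ke = 0.5 * m * v_orb**2           # ~30 MJ
    print(f"lift energy ~ {lift_energy / 1e6:4.1f} MJ")
    print(f"orbital KE  ~ {orbital_ke / 1e6:4.1f} MJ  (v_orb ~ {v_orb / 1e3:.1f} km/s)")

    # Track length L versus acceleration a = v^2 / (2 L) for a 6 km/s muzzle speed:
    v = 6e3
    for L_miles in (1, 100, 150):
        L = L_miles * 1609.34
        a = v**2 / (2 * L)
        print(f"track {L_miles:4d} miles -> a ~ {a:8.0f} m/s^2 ~ {a / g:7.1f} g")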






Yes, it is possible; the tech is here, and it is economical and feasible to launch anything from cargo to people. It has not gotten a lot of attention because all the big boys use rockets already, and no one has proposed a railgun that can launch more than cargo. But it has caught the attention of some young scientists who are going to gun for this, so sit back and check the news in a few years.


atomic physics - Interaction between electrons in which the magnetic dipole moments interact more strongly than their electric fields


Asking the question "Has anyone tried to incorporate the electron's magnetic dipole moment into the atomic orbital theory?", I was curious whether anyone has attempted to relate the intrinsic magnetic moment of the electron to the above-mentioned properties of spin.


In the extremely detailed answer (thanks to the author, who took the time despite the pointlessness of such a question) it is clarified that



The effects are weak, and they are secondary to all sorts of other interactions that happen in atoms,...


Also, in case you're wondering just how weak: this paper calculates the energy shifts coming from electron spin-spin coupling for a range of two-electron systems. The largest is in helium, for which the coupling energy is of the order of $\sim 7 \:\mathrm{cm}^{-1}$, or about $0.86\:\rm meV$, as compared to typical characteristic energies of $\sim 20\:\rm eV$, some five orders of magnitude higher, for that system.




Now there is a new question about Electron to electron interaction.



There is a critical distance


$$d_\text{crit}=\sqrt\frac{3\epsilon_0\mu_0\hbar^2}{2m^2}=\sqrt{\frac{3}{2}}\frac{\hbar}{mc}=\sqrt{\frac{3}{2}}\overline\lambda_C,$$


where $\overline\lambda_C$ is the reduced Compton wavelength of the electron, at which the two forces are equal in magnitude.


Since the Compton wavelength is a standard measure of where quantum effects start to be important, this classical analysis can't be taken too seriously. But it indicates that spin-spin interactions are important at short distances.



I wonder how these two points of view can be related.



Answer



They can be related by the fact that the Bohr radius of hydrogen is $1/\alpha\approx 137$ times larger than the reduced Compton wavelength of the electron. (Here $\alpha$ is the fine-structure constant. For helium, divide by 2 to get 68.5.) At this large a separation between the proton and electron, the magnetic interaction that I calculated is small compared to the electrostatic interaction.
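

To put numbers on the comparison (an added sketch, not part of the original answer), one can evaluate the reduced Compton wavelength, the critical distance $\sqrt{3/2}\,\overline\lambda_C$ from the quoted formula, and the Bohr radius; the constants below are standard values.

    import math

    hbar  = 1.054571817e-34    # J s
    m_e   = 9.1093837015e-31   # electron mass, kg
    c     = 2.99792458e8       # m/s
    alpha = 7.2973525693e-3    # fine-structure constant

    lambda_C_bar = hbar / (m_e * c)               # reduced Compton wavelength
    d_crit = math.sqrt(1.5) * lambda_C_bar        # where dipole-dipole ~ Coulomb
    a_bohr = lambda_C_bar / alpha                 # Bohr radius

    print(f"reduced Compton wavelength : {lambda_C_bar:.3e} m")
    print(f"critical distance d_crit   : {d_crit:.3e} m")
    print(f"Bohr radius                : {a_bohr:.3e} m")
    print(f"a_bohr / lambda_C_bar      : {a_bohr / lambda_C_bar:.1f}")   # ~137

    # Atomic separations are ~137 reduced Compton wavelengths, far outside the
    # region where the magnetic dipole-dipole force competes with the
    # electrostatic one, which is the point of the answer above.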



mathematical physics - Are there physical theories that require the axiom of choice to be "true" to work?


I have been wondering about the axiom of choice and how it relates to physics. In particular, I was wondering how many (if any) experimentally verified physical theories require the axiom of choice (or well-ordering) and whether any theories actually require constructibility. As a math student, I have always been told the axiom of choice is invoked because of the beautiful results that transpire from its assumption. Do any mainstream physical theories require AoC or constructibility? If so, how do they require them?



Answer



No, nothing in physics depends on the validity of the axiom of choice, because physics deals with the explanation of observable phenomena. Infinite collections of sets – and they are what the axiom of choice is about – are obviously not observable (we only observe a finite number of objects), so experimental physics can say nothing about the validity of the axiom of choice. If it could say something, it would be very paradoxical, because the axiom of choice is about pure maths, and moreover maths can prove that systems with and without AC are equally consistent.


Theoretical physics is no different because it deals with various well-defined, "constructible" objects such as spaces of real or complex functions or functionals.


For a physicist, just like for an open-minded evidence-based mathematician, the axiom of choice is a matter of personal preferences and "beliefs". A physicist could say that any non-constructible object, like a particular selected "set of elements" postulated to exist by the axiom of choice, is "unphysical". In mathematics, the axiom of choice may simplify some proofs but if I were deciding, I would choose a stronger framework in which the axiom of choice is invalid. A particular advantage of this choice is that one can't prove the existence of unmeasurable sets in the Lebesgue theory of measure. Consequently, one may add a very convenient and elegant extra axiom that all subsets of real numbers are measurable – an advantage that physicists are more likely to appreciate because they use measures often, even if they don't speak about them.


x rays - Do free-electron lasers actually lase?


Free-electron lasers are devices which use the motion of highly energetic electron beams to produce bright, coherent radiation in the x-ray regime. More specifically, they start with a high-energy electron beam and feed it into an undulator, which is an array of alternating magnetic fields designed to make the electron beam move in a 'zigzag' path with sharp turns on either side, emitting synchrotron radiation during each turn.



The radiation thus produced is added up over successive turns, and it is produced coherently via self-amplified spontaneous emission.



One question frequently posed about this setup is: is this actually a laser? That is, does it have population inversion, and is the radiation actually emitted via stimulated emission in some suitable understanding of it? People in the know tend to answer in the affirmative, but I've yet to see a nice explanation of why - and definitely not online. So: do FELs actually lase?



Answer



You are missing a crucial aspect of the dynamics of a free-electron laser: microbunching. This comes from the fact that although electrons at different energies share basically the same velocity (essentially $c$), they have different oscillation amplitudes in the undulator, and therefore they shift longitudinally.


Since you mentioned the SASE mechanism, let me expand on it: the noise in the initial electron distribution guarantees some density peaks which will start to radiate coherently (the power going as $N^2$). As the radiation slips through the bunch (remember that photons go straight, while electrons are wiggled) it exchanges energy with it, triggering energy modulations. But, as we saw before, these result in a longitudinal shift, so we get additional density modulations at the radiation wavelength.


The result is that your initially long bunch slowly splits into a number of very short (micro)bunches, all of them radiating coherently, with a great boost in the radiation intensity gain. Of course the microbunching cannot continue endlessly; it eventually reaches saturation, where the dynamics becomes strongly nonlinear and the gain stops.


While microbunching develops, the radiation increases up to saturation.


Therefore you certainly have light amplification by stimulated emission of radiation: the initial radiation stimulates microbunching, leading to even more radiation. Obviously this is not the standard interpretation from atomic physics, but in common English it fits perfectly. The population inversion might be seen in the microbunching factor: the initial, more or less uniform, beam is completely "inverted", and as the microbunching develops this is progressively lost, up to saturation.


For some further reading (and also the source of the nice picture): FEL@Desy.de (use the menu on the left)


Monday, 26 October 2020

quantum mechanics - Basic question on the Aharonov-Bohm effect


I have a very basic question on the Aharonov-Bohm effect.


Why is the closed line integral $\oint_\Gamma {A}\cdot d{r}$ non-zero? Here $\Gamma$ is the "difference" of the two paths $P_1$ and $P_2$. If the magnetic field is confined to the interior of the solenoid, then $\operatorname{curl} {A}=0$ along the integration path $\Gamma$, so I conclude that I can write ${A}=\nabla f$. A closed line integral of a gradient field is zero.


I guess it is related to a possible singularity of $A$ at the very center of the solenoid.


Nevertheless, if I travel around a point-like source of a gravitational field and compute the integral $\oint_\Gamma {F}\cdot d{r}$ where $F=-\nabla V(r)$, the closed line integral over a conservative force field is certainly zero, even though $V(r)$ has a singularity and consequently so does $F$. I would be very grateful for an explanation.



Answer




From $$\operatorname{curl} A = 0 \tag{1} $$ in a region $U$, you can in general not conclude that $$A = \nabla f \tag{2}$$ for some function $f$ defined on all of $U$. Indeed this is related to the singularity, which removes a line through the origin. The degree to which (1) fails to imply (2) depends on the topology of $U$, more specifically on its de Rham cohomology. (1) implies (2) iff the first de Rham cohomology is trivial. For $U = \mathbb{R} \times (\mathbb{R^2} \setminus \{ 0 \} )$, it turns out that the cohomology is not trivial, and hence your integral need not vanish.


To show that (1) does not imply (2) for 3-space with a line removed, we can assume that the line is the $z$-axis and, consider $$A = \begin{cases} \nabla \operatorname{arctan2} (y,x) & (x,y) \neq (0,-1) \\ (0, -1, 0 ) & (x,y) = (0,-1) \end{cases}$$ where also $(x,y) \neq (0,0)$. $A$ is smooth, and $\operatorname{curl} A = 0$, but you cannot find a function $f$, continuous away from the $z$-axis, such that $\nabla f = A$ everywhere.


It is less trivial (or rather, one first needs to construct some sledgehammers, and then it's a two-line proof) to show that for 3-space with a point removed, (1) does indeed imply (2).




Update: If you want to understand why a point can be removed but not a line, one can think of it as needing to make a hole that is big enough. Suppose we have two curves $\gamma_1$ and $\gamma_2$ and $\int_{\gamma_i} A$, where $\operatorname{curl} A = 0$. If $\gamma_1$ can be deformed to $\gamma_2$ one can apply Stokes's theorem to argue that the integrals are equal. Suppose that the space is such that any curve can be deformed into any other curve. Then a standard prescription exists to find $f$: pick any point $x_0$ and let $$f(x) = \int_\gamma A $$ where $\gamma$ is any curve connecting $x_0$ and $x$. By the assumption the integral doesn't depend on the particular $\gamma$ chosen so $f$ is unambiguously defined. To make the argument fail you have to make a "hole" big enough that curves can't be deformed into each other. Removing a point isn't enough in three dimensions, but a line is: make a loop around the line singularity, you can't get it to not encircle the line without dragging it across the singularity. (In 2 dimensions a point is enough, because you can consider the projection of this example onto a plane.)
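

A minimal numerical check of the point above (an added sketch, not part of the original answer), using the angle-gradient field $A=(-y,x,0)/(x^2+y^2)$: its curl vanishes away from the $z$-axis, yet its closed line integral is $2\pi$ around a loop that encircles the axis and $0$ around one that does not, so it cannot be a global gradient on 3-space minus a line.

    import numpy as np

    def A(x, y):
        """The angle-gradient field (-y, x)/(x^2 + y^2), restricted to the plane."""
        r2 = x**2 + y**2
        return np.stack([-y / r2, x / r2], axis=-1)

    def loop_integral(center, radius, n=100000):
        """Numerically integrate A . dr around a circle in the z = 0 plane."""
        t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
        x = center[0] + radius * np.cos(t)
        y = center[1] + radius * np.sin(t)
        dx = -radius * np.sin(t) * (2 * np.pi / n)
        dy = radius * np.cos(t) * (2 * np.pi / n)
        a = A(x, y)
        return float(np.sum(a[:, 0] * dx + a[:, 1] * dy))

    print("loop around the z-axis     :", loop_integral(center=(0.0, 0.0), radius=1.0))
    print("loop not around the z-axis :", loop_integral(center=(3.0, 0.0), radius=1.0))
    print("2*pi                       :", 2 * np.pi)

    # The first integral is 2*pi even though curl A = 0 along the loop; the
    # second is 0, mirroring the flux / no-flux cases in the Aharonov-Bohm setup.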


cosmology - How can the universe expand if there is gravitation?


We live in an expanding universe - so I'm told. But how can that be possible? Everything imaginable is attracted by a bigger thing. So, why can't gravitation stop the expansion of the universe? I know the "Big Bang" theory, but is it possible that the expansion of the universe is caused by the attraction of a bigger object?




general relativity - Shape of the universe?




  1. What is the exact shape of the universe? I know of the balloon analogy, and the bread with raisins in it. These clarify some points, like how the universe can have no centre, and how it can expand equally everywhere in all directions.





  2. But they also raise some questions: for example, if you are on the surface of a balloon and travel in one direction, you will eventually return to your starting point. Is it possible our universe has this feature?




  3. If it has, or had, would this be a symmetry of sorts ($\psi(x)=\psi(x+R)$), and as such have a conserved quantity associated with it (by Noether)?




  4. Assuming "small curled up dimensions" wouldn't these dimensions have this type of symmetry, what are the associated conserved quantities?





  5. Is it known exactly what the geometrical shape of the universe is? (on a large scale) (I am not talking about only the observable universe).




  6. How does one define the "size" of a dimension? Is this scale only applicable to curled-up ones?




  7. Is it possible to describe to a layman the shape of the universe without resorting to inept analogies?





Answer




There are a bunch of questions here. Let me try to take them in order:



  • Is it possible that our Universe has the feature that if you travel far enough you return to where you started?


Yes. The standard Big-Bang cosmological model is based on the idea that the Universe is homogeneous and isotropic. One sort of homogeneous spacetime has the geometry of a 3-sphere (like a regular sphere, but with one more dimension). In these cosmological models, if you travel far enough you get back to where you started.


However, the best available data seem to indicate that the Universe is very nearly spatially flat. This means that, if we do live in a 3-sphere Universe, the radius of the sphere is very large, and the distance you'd have to travel is much larger than the size of the observable Universe. Even if that weren't true, the fact that the Universe is expanding would make it hard or impossible to circumnavigate the Universe in practice: no matter how fast you went (short of the speed of light), you might never make it all the way around. Nonetheless, 3-sphere Universes, with the geometrical property you describe, are definitely viable cosmological models.



  • Does this give rise to a symmetry by Noether's theorem?


Not really. Noether's theorem is generally applied to continuous symmetries (i.e., ones that can be applied infinitesimally), not discrete symmetries like this. The homogeneity of space is a symmetry that gives rise to momentum conservation, whether or not space has the 3-sphere geometry, but the symmetry you're talking about here doesn't give rise to anything extra.




  • Would small curled up dimensions have the same sort of symmetry?


I'll leave this for someone else, I think. Not my thing.



  • Is it known exactly what the geometrical shape of the universe is?


No, and don't let anyone tell you otherwise! Sometimes, especially in pop-science writing, people imply that we know a lot more about the global properties of the Universe than we do. We often assume things like homogeneity to make our lives simpler, but in fact we have precisely no idea what things are like outside of our horizon volume.



  • How to describe the "size" of a dimension?



If the Universe's geometry has enough symmetries, it makes sense to define an overall time coordinate everywhere. Then it makes sense to imagine a "slice" through spacetime that represents the Universe at an instant of time. If some of those slices have the geometrical property you're talking about, that traveling a distance R in a certain direction gets you back to your starting point, then it makes sense to call R the "size" of the corresponding dimension. If you can travel forever, then we say the size in that dimension is infinite.



  • Is it possible to describe to a layman the shape of the universe without resorting to inept analogies?


All analogies are imperfect. I think the best you can do is use a bunch of them and try to convey the limitations of each.

