There are proofs in the literature that QFT with microcausality is sufficient to rule out sending signals by making quantum mechanical measurements associated with space-like separated regions of space-time, but is there a proof that microcausality is necessary for no-signaling?
I take microcausality to be trivial commutativity of measurement operators associated with space-like separated regions of space-time. In terms of operator-valued distributions $\hat\phi(x)$ and $\hat\phi(y)$ at space-like separated points $x$ and $y$, $\left[\hat\phi(x),\hat\phi(y)\right]=0$.
An elementary observation is that classical ideal measurements satisfy microcausality just insofar as all ideal measurements are mutually commutative, irrespective of space-like, light-like, or time-like separation. For a classical theory, ideal measurements have no effect either on the measured system or on other ideal measurements, so ideal measurements cannot be used to send messages to other ideal experimenters. Whether a classical theory admits space-like signaling within the measured system, in contrast to signaling by the use of ideal measurements applied by observers who are essentially outside the measured system, is determined by the dynamics. Applying this observation to QFT, I note that microcausality is not sufficient on its own —without any dynamics, which I take to be provided in QFT by the spectrum condition— to prove no-signaling at space-like separation.
As an auxiliary question, what proofs of sufficiency do people most often cite and/or think most well-stated and/or robust?
I will try to give a series of counterexamples, of decreasing triviality:
No signalling plus little external agents implies microcausality
As you point out in the question, when you have arbitrarily tiny external agents capable of measuring any bosonic field in an arbitrarily tiny region, microcausality is obviously necessary for no signaling. If you have two noncommuting operators A and B associated with two tiny spacelike separated regions, and two external agents want to transmit information from A's region to B's, one agent can either measure A repeatedly or not, while the other agent measures B a few times to see whether A is being measured. The B measurements will have a probability of giving different answers depending on whether A is measured, which informs the B agent about the A measurement.
This is the motivation for microcausality, and for the purposes of physics, the existence of semiclassical black holes means that you have classical point probes at any distance significantly larger than the Planck length, so microcausality is necessary at least for these scales.
This point is addressed in your question. From now on, I will ask the intrinsic question: can observers in the theory signal using devices built out of the fields in the theory, rather than external probes? This makes the question nontrivial.
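A minimal numerical sketch of this protocol, under toy assumptions: a single qubit with A = σ_x and B = σ_z stands in for two noncommuting "local" observables (there is no actual spacetime here), and a two-qubit Bell state with A and B acting on different factors stands in for the commuting case. The function names (`projectors`, `measure_nonselective`, `outcome_probs`, `signal`) are hypothetical, chosen for illustration. The B agent does not know A's outcome, so A's measurement is modeled as the nonselective update ρ → Σ_a P_a ρ P_a.

```python
import numpy as np

def projectors(op):
    """Spectral projectors of a Hermitian matrix, keyed by eigenvalue."""
    vals, vecs = np.linalg.eigh(op)
    projs = {}
    for val, vec in zip(np.round(vals, 10), vecs.T):
        key = float(val)
        projs[key] = projs.get(key, 0) + np.outer(vec, vec.conj())
    return projs

def measure_nonselective(rho, op):
    """State after an ideal measurement of op whose outcome is not known."""
    return sum(P @ rho @ P for P in projectors(op).values())

def outcome_probs(rho, op):
    """Born-rule probabilities for each eigenvalue of op."""
    return {k: float(np.real(np.trace(P @ rho)))
            for k, P in projectors(op).items()}

def signal(rho, A, B):
    """Largest change in B's outcome statistics caused by measuring A."""
    before = outcome_probs(rho, B)
    after = outcome_probs(measure_nonselective(rho, A), B)
    return max(abs(before[k] - after[k]) for k in before)

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# Noncommuting toy case: one qubit prepared in the +1 eigenstate of sigma_z.
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
noncommuting_signal = signal(rho0, sx, sz)

# Commuting toy case: Bell state, A and B act on different tensor factors.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_bell = np.outer(bell, bell.conj())
commuting_signal = signal(rho_bell, np.kron(sx, I2), np.kron(I2, sz))

print("noncommuting:", noncommuting_signal)
print("commuting:   ", commuting_signal)
```

In the noncommuting case the B agent's statistics shift when A is measured, which is the signal; in the commuting case they do not, even though the state is entangled.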
Two space no-gravity QFT
Consider a quantum field theory with a bad localization. The theory is defined using a spacelike shift vector $\Delta$, and the action has a displaced interaction:
$$ S= \int d^4x \left[ L_1(\phi) + L_2(\chi) + \phi(x)\chi(x+\Delta) \right] $$
Here the L's are some translationally invariant local Lagrangians for $\phi$ and $\chi$, and the interaction mixes $\phi$ and $\chi$ at displaced points. This is clearly ridiculous: the field $\chi$ has been misplaced, and the correct local field associated with a given point $x$ is $\chi(x+\Delta)$, not $\chi(x)$. The point is that you can define an algebra of observables using this completely wrong localization, and then microcausality obviously fails, because $\chi$ and $\phi$ are assigned to the wrong points. But because there is a change of variables which makes microcausality work, there is no signalling for objects in the theory, intrinsically (although for an external agent capable of making local measurements of $\phi(x)+\chi(x)$, no signalling would fail). So the question would be better stated as "Does there have to be some collection of field variables which obey microcausality for no-signalling to work?"
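To spell out the change of variables (a sketch, assuming $L_2$ is translation invariant as stated, and writing $\tilde\chi$ for the relabeled field, a symbol introduced here for illustration), define

$$ \tilde\chi(x) \equiv \chi(x+\Delta) $$

Since $\int d^4x\, L_2(\chi) = \int d^4x\, L_2(\tilde\chi)$ by translation invariance, the action becomes

$$ S = \int d^4x \left[ L_1(\phi) + L_2(\tilde\chi) + \phi(x)\tilde\chi(x) \right] $$

which has a local interaction. If the relabeled theory obeys microcausality, so that $\left[\phi(x),\tilde\chi(y)\right]=0$ for spacelike $x-y$, then in the original variables

$$ \left[\phi(x),\chi(y)\right]=0 \quad \text{for } x-y+\Delta \text{ spacelike} $$

and the commutator need not vanish when $x-y$ is spacelike but $x-y+\Delta$ is timelike.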
Curved Extra dimensions
Suppose you have a warped extra dimension, so that in 4+1 dimensions you have fields which are local, but the background is not a product. Then you can consider the theory as a 4 dimensional quantum field theory, and in this framework, try to identify mutually local four-dimensional fields. This doesn't work, at least not in a way consistent with Lorentz invariance, because time ticks at a different rate at different positions in the extra dimension, so fields chosen to obey 4 dimensional Lorentz invariance will not all be mutually local.
But if the shortest distance between two points on your brane-world is a straight line on the brane-world, then no signalling still holds. Gubser examines this situation in a recent preprint (http://arxiv.org/PS_cache/arxiv/pdf/1109/1109.5687v2.pdf) with an eye to reproducing the claimed OPERA neutrino violations of no-signalling, and he says that the effective 4d theory only violates no signalling when the 5d theory violates the weak energy condition. But the 5d theory will generically violate any attempted identification of a 4d microcausality.
Emergent dimensions
I think that the best examples of where microcausality can fail, and still there is no intrinsic signalling, are within string theory. This is not quantum field theory, so it might not be included, but it is the starkest example of a nonlocal theory where no-signalling (presumably) works, but there is no microcausality, because there aren't local fields in the bulk.
In AdS/CFT, the bulk theory is defined by a holographic projection of the boundary fields, and if you have N=4 gauge theory on the boundary, you only have boundary microcausality. You can define effective local fields in the bulk, which create a string excitation, but these fields will not commute at the string scale, since the strings are extended. They are not fundamental objects anyway, and their localization is only that of a center of mass.
So in my opinion, the best answer is no, although the answer might as well be yes outside of the quantum gravity regime, because at larger scales, tiny black holes can be used as point probes to make local measurements of fields.