Measuring Gravity: Ain't Nothin' but a G Thing


Category: Experiment, Physics, ResearchBlogging, Science
Posted on: August 26, 2010 12:25 PM, by Chad Orzel


There's a minor scandal in fundamental physics that doesn't get talked about much, and it has to do with the very first fundamental force discovered, gravity. The scandal is the value of Newton's gravitational constant G, which is the least well known of the fundamental constants, with a value of 6.674 28(67) x 10^-11 m^3 kg^-1 s^-2. That may seem pretty precise, but the uncertainty (the two digits in parentheses) is scandalously large when compared to something like Planck's constant at 6.626 068 96(33) x 10^-34 J s. (You can look up the official values of your favorite fundamental constants at this handy page from NIST, by the way...)
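To put numbers on that comparison, here's a quick Python sketch that turns the two quoted values into relative uncertainties. The function name is made up for illustration, and the powers of ten are dropped because they cancel when you divide the uncertainty by the value.

```python
# Relative uncertainties of G and h, using only the values quoted above.
# The helper name is hypothetical; exponents are omitted because they cancel.

def relative_uncertainty(value, uncertainty):
    """Return uncertainty / value as a dimensionless fraction."""
    return uncertainty / value

# G = 6.674 28(67): the (67) is the uncertainty in the last two digits.
rel_G = relative_uncertainty(6.67428, 0.00067)

# h = 6.626 068 96(33): same notation, (33) applies to the last two digits.
rel_h = relative_uncertainty(6.62606896, 0.00000033)

print(f"G: about {rel_G * 1e6:.0f} parts per million")
print(f"h: about {rel_h * 1e9:.0f} parts per billion")
print(f"G is roughly {rel_G / rel_h:.0f} times less precisely known than h")
```

That works out to roughly 100 parts per million for G against roughly 50 parts per billion for h, a gap of about a factor of two thousand.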

To make matters worse, recent measurements of G don't necessarily agree with each other. In fact, as reported in Nature, the most recent measurement, available in this arXiv preprint, disagrees with the best previous measurement by a whopping ten standard deviations, which is the sort of result you should never, ever see.

This obviously demands some explanation, so:

What's the deal with this? I mean, how hard can it be to measure gravity? You drop something, it falls, there's gravity. It's easy to detect the effect of the Earth's gravitational pull, but that's just because the Earth has a gigantic mass, making the force pretty substantial. If you want to know the precise strength of gravity, though, which is what G characterizes, you need to look at the force between two smaller masses, and that's really difficult to measure.

Why? I mean, why can't you just use the Earth, and measure a big force? If you want to know the force of gravity to a few parts per million, you would need to know the mass of the Earth to better than a few parts per million, and we don't know that. A good measurement of G requires you to use test masses whose values you know extremely well, and that means working with smaller masses. Which means really tiny forces-- the force between two 1 kg masses separated by 10 cm is 6.6 x 10^-9 N, or about the weight of a single cell.
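If you want to check that force for yourself, it's just Newton's law of gravitation, F = G m1 m2 / r^2. Here's a minimal Python sketch; the function name is mine, and the value of G is the CODATA number quoted at the top of the post.

```python
# Newton's law of gravitation for two point masses.
# G is the CODATA value quoted earlier in the post; the function name is
# just an illustrative choice.

G = 6.67428e-11  # m^3 kg^-1 s^-2

def gravitational_force(m1_kg, m2_kg, separation_m):
    """Magnitude of the attraction between two point masses, in newtons."""
    return G * m1_kg * m2_kg / separation_m**2

# Two 1 kg masses separated by 10 cm.
force = gravitational_force(1.0, 1.0, 0.10)
print(f"{force:.2e} N")
```

Treating the masses as points is an idealization, of course. Real experiments have to account for the actual shape and density of every mass involved, which is a big part of why they're so hard.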

OK, I admit, that's a bit tricky. So how do they do it? There are four papers cited in the Nature news article. I'll say a little bit about each of them, and how they figure into this story.

The oldest measurement cited by Nature is the torsion balance measurement from 2000 by the Eöt-Wash group at the University of Washington. This is an extremely refined version of the traditional method of measuring G first developed by Henry Cavendish in the late 1700s.

Let's assume I'm too lazy to follow that link, and summarize in this post, mmmkay? OK. Cavendish's method used a "torsion pendulum," which is a barbell-shaped mass hung at the end of a very fine wire, as seen at right. You put two test masses near the ends of the barbell, and they attract the barbell, causing the wire to twist. The amount of twist you get depends on the force, so by measuring the twist of the wire for different test masses and different separations, you can measure the strength of gravity and its dependence on distance.

Sounds straightforward enough. It is, in concept. Of course, given the absurdly tiny size of the forces involved, it's a really fiddly measurement to do. Cavendish himself set the apparatus up inside a sealed room, and then read the twist off from outside, using a telescope. If he was in the room looking at the apparatus, the air currents created by his presence were enough to throw things off.

This remained the standard technique for G measurements for about two centuries, though, because it's damnably difficult to do better. And the Eöt-Wash group's version is really astounding.

So, how did they do better? One of the biggest sources of error in the experiment comes from the twisting of the wire. In an ideal world, the response of the wire would be linear-- that is, if you double the force, you double the twist. In the real world, though, that's not a very good assumption, and that makes the force measurement really tricky if the wire twists at all.


The great refinement introduced by the Eöt-Wash group was to not allow the wire to twist. They mounted their pendulum, shown at left, on a turntable, and made small rotations of the mount as the wire started to twist, to prevent the twist from becoming big. Their force measurement was then determined by how much they had to rotate the turntable to compensate for the gravitational force causing a twist of the wire.

They also mounted the attracting masses on a turntable, and rotated it in the opposite direction around the pendulum, to avoid any systematic problems caused by the test masses or their positioning. Their signal was thus an oscillating correction signal, as each test mass passed by their pendulum, and they recorded data for a really long time: their paper reports on six datasets, each containing three days' worth of data acquisition.

The value they got was 6.674 215 6 ± 0.000092 x 10^-11 m^3 kg^-1 s^-2, far and away the best measurement done to that point.

So what are the other papers? The second one, in chronological order, is a Phys. Rev. D paper from a group in Switzerland, who used a beam balance to make their measurement. They had two identical test masses hung from fine wires, and they alternately weighed each mass while moving enormous "field masses" weighing several metric tons each into different positions, as shown in the figure at right. In the "T" configuration, the upper test mass should appear heavier than the lower test mass, as the large field masses between them pull one down and the other up. In the "A" configuration, the upper test mass should be lighter, as the field masses pull it up while pulling the lower mass down.

This was another experiment with very long data taking, including this wonderfully deadpan description:

The equipment was fully automated. Measurements lasting up to 7 weeks were essentially unattended. The experiment was controlled from our Zurich office via the internet with data transfer occurring once a day.

Their value, 6.674 252(109)(54) x 10^-11 m^3 kg^-1 s^-2, is in good agreement with the Eöt-Wash group's result.

If it agrees, why even mention it? It's an important piece of the story, because it's a radically different technique, giving the same answer. It's extremely unlikely that these would accidentally come out to be the same, because the systematic effects they have to contend with are so very different.

Yeah, great. Get to the disagreement. OK, OK. The third measurement, in this PRL by a group in China, uses a pendulum again, but a different measurement technique. They used a rectangular quartz block as their pendulum, suspended by the center, with test masses outside the pendulum. They placed these test masses in one of two positions: near the ends of the pendulum when it was at rest (shown in the figure), or far away from the ends (where the "counterbalancing rings" are in the figure).

The gravitational attraction of the masses in the near configuration makes the pendulum twist at a slightly different rate than in the far configuration, and that's what they measured. The oscillation period was almost ten minutes, and the difference between the two was around a third of a second, which gives you some idea of how small an effect you get.
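To get a feel for the size of that signal, here's a rough Python sketch of the arithmetic. The period and shift are the rounded figures from the paragraph above, not the paper's actual numbers, and the only physics assumed is the textbook torsion pendulum relation omega^2 = kappa / I, where the source masses add a small extra restoring torque. The real analysis in the paper is far more involved.

```python
# Rough size of the time-of-swing signal, using rounded figures from the text.
# Assumption: ideal torsion pendulum with omega^2 = kappa / I, so a small
# fractional change in period maps to about twice that fractional change
# in omega^2 (first-order approximation).

period_s = 600.0            # "almost ten minutes", rounded for illustration
period_shift_s = 1.0 / 3.0  # "around a third of a second"

fractional_period_shift = period_shift_s / period_s
fractional_omega_sq_shift = 2.0 * fractional_period_shift

print(f"Fractional change in period:  {fractional_period_shift:.1e}")
print(f"Fractional change in omega^2: {fractional_omega_sq_shift:.1e}")
```

So the whole gravitational signal is a change of a few parts in ten thousand in the period, and to get G to tens of parts per million you need to pin down that small difference to a few parts in a hundred thousand.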

Their value was 6.673 49(18) x 10^-11 m^3 kg^-1 s^-2, which has a significantly larger uncertainty than the other two, but even with that, doesn't agree with them. Which is kind of a problem.

So, how do you deal with that? Well, they obviously had a little trouble getting the paper through peer review-- it says it was first submitted in 2006, but not published until 2009. That probably means they needed to go back and re-check a bunch of their analysis to satisfy the referees that they'd done everything correctly. After that, though, all you can do is put the result out there, and see what other people can make of it.

Which brings us to the final paper? Exactly. This is an arXiv preprint, and thus isn't officially in print yet, but it has been accepted by Physical Review Letters.

They use yet another completely different technique, this one employing free-hanging masses whose position they measure directly using a laser interferometer. They also have two configurations, one with a bunch of source masses between the two hanging masses, the other with the source masses outside the hanging masses. The gravitational attraction of the 120 kg source masses should pull the hanging masses either slightly closer together, or slightly farther apart, depending on the configuration, and this change of position is what they measure.

Their value is 6.672 34 ± 0.000 14 x 10^-11 m^3 kg^-1 s^-2, which has nice small error bars-- only the Eöt-Wash result is better in that regard-- but is way, way off from the other values. Like, ten times the uncertainty off from the other values. There's no obvious reason why this would be the case, though. If anything, the experiment is simpler in concept than any of the others, so you would expect it to be easier to understand. There aren't any really glaring flaws in the procedure, though (it never would've been accepted otherwise), so this presents a problem.
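For the curious, here's a small Python sketch of how you'd put a number on "way, way off." It takes the four values as quoted in this post and compares each of the later three to the Eöt-Wash result, dividing the difference by the two uncertainties added in quadrature. The labels and function name are mine, and I'm assuming the two parenthesized uncertainties on the Swiss result are independent contributions that can be combined in quadrature. This is the simplest possible comparison, not what a CODATA adjustment actually does.

```python
import math

# (value, one-sigma uncertainty), in units of 10^-11 m^3 kg^-1 s^-2,
# copied from the values quoted in this post. Labels are illustrative.
measurements = {
    "Eot-Wash torsion balance (2000)": (6.6742156, 0.000092),
    # Assumption: the two quoted uncertainties, (109) and (54), are
    # independent and combine in quadrature.
    "Swiss beam balance (2006)": (6.674252, math.hypot(0.000109, 0.000054)),
    "China time-of-swing (2009)": (6.67349, 0.00018),
    "Parks & Faller pendulums (2010)": (6.67234, 0.00014),
}

def tension(a, b):
    """Difference between two results in units of their combined sigma."""
    (value_a, sigma_a), (value_b, sigma_b) = a, b
    return abs(value_a - value_b) / math.hypot(sigma_a, sigma_b)

reference_name = "Eot-Wash torsion balance (2000)"
reference = measurements[reference_name]

for name, result in measurements.items():
    if name != reference_name:
        print(f"{name}: {tension(result, reference):.1f} sigma from Eot-Wash")
```

Run that and the Swiss result comes out well under one sigma from Eöt-Wash, the time-of-swing result a few sigma away, and the newest result more than ten sigma away.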

So, now what? Well, in the short term, this probably means that the CODATA value for G (the official, approved number used by international physics) will need to be revised to increase the uncertainty. This is kind of embarrassing for metrology, but has happened before-- a past disagreement of this type is one of the things that prompted the original Eöt-Wash measurements.

In the medium to long term, you can bet that every group with a bright idea about how to measure G is tooling up to make another run at it. This sort of conflict, like any other problem in physics, will ultimately need to be resolved by new data.

Happily, these experiments cost millions of dollars (or less), not billions, so we can hope for multiple new measurements with different techniques to resolve the discrepancy. It'll take a good long while, though, given how slowly data comes in for these types of experiment, which will give lots of people time to come up with new theories of what's really going on here.

Gundlach, J., & Merkowitz, S. (2000). Measurement of Newton's Constant Using a Torsion Balance with Angular Acceleration Feedback. Physical Review Letters, 85 (14), 2869-2872. DOI: 10.1103/PhysRevLett.85.2869

Schlamminger, S., Holzschuh, E., Kündig, W., Nolting, F., Pixley, R., Schurr, J., & Straumann, U. (2006). Measurement of Newton's gravitational constant. Physical Review D, 74 (8). DOI: 10.1103/PhysRevD.74.082001

Luo, J., Liu, Q., Tu, L., Shao, C., Liu, L., Yang, S., Li, Q., & Zhang, Y. (2009). Determination of the Newtonian Gravitational Constant G with Time-of-Swing Method. Physical Review Letters, 102 (24). DOI: 10.1103/PhysRevLett.102.240801

Parks, H. V., & Faller, J. E. (2010). A Simple Pendulum Determination of the Gravitational Constant. Physical Review Letters (accepted). arXiv: 1008.3203v2




Nice writeup, and I'm ashamed to nitpick about grammar/spelling, but I believe you may have misspelled the word "Thang" in the title.

Posted by: Anonymous Coward | August 26, 2010 2:11 PM


Ha! It's nice to see you physicists have trouble too.

It looks as if different methods give different results (suggesting they all have different biases). How will you lot know which one is correct?

Also, would altitude make a difference?

Posted by: Bob O'H | August 26, 2010 2:43 PM


What usually happens in this sort of situation is that a few measurements using different techniques will turn out to agree with each other reasonably well, and that will come to be accepted as the "real" value. The data from all of these will be re-analyzed, and eventually somebody will find a plausible systematic effect that could've thrown the results off.

If I had to guess, I'd say that the Eöt-Wash result is probably closer to the final value than the newer measurements. The period measurement seems to invite exactly the sort of weird twisting-wire effects that the original turntable measurements were designed to avoid, and the hanging-pendulum thing is, as far as I know, a very new technique, and the most likely to have some subtle problem that hasn't been noticed yet because people haven't been banging on it for decades like the torsion pendulum. That's just a guess, though.

Posted by: Chad Orzel | August 26, 2010 2:53 PM


Why do Eöt-Wash give "6.6742156" the extra digits "56" if, by their own estimates, the "2" might just as well be "1" or even "3"?

When I was an undergraduate, G was known to only three digits, so this is progress. I wonder about three-day experiments, though. At this stage they should be collecting measurements for several years before reporting.

Posted by: Nathan Myers | August 26, 2010 3:11 PM


Re: Nathan Myers

Re: the digits, I think there's a typo in Chad's writeup and I think you also lost a digit in your reading.

I think the numbers are:
6.674215
with an error bar of
0.000092
(times the appropriate units).

It's standard these days to give 2 digits of error bar, and write your number out to the same length, so I don't think there's anything weird here.

Re: "I wonder about three-day experiments, though. At this stage they should be collecting measurements for several years before reporting."

I disagree for two reasons.

First, precision measurement is partially about statistical error (which you can improve by taking data longer) and partially about checking for systematic errors (which don't improve through averaging). To check for systematics, typically one varies conditions and remeasures things to see how the measured value depends on stuff. To do this well, you may need to take data for almost as long under your "altered" conditions as you do under your "good" conditions. And in most experiments, there are dozens of possible systematics to check. So, a good rule of thumb for a precision measurement experiment is that, when estimating your statistical error, you allow for a day (or maybe a week) of data taking, so that you can do your systematic checks on the timescale of a couple of years. So I'm guessing that it's not that these guys are lazy or got bored; it's that they're responsible.

Secondly, if you look at their error budget in the paper, the statistical error accounts for 6 ppm error, and their total error is 14 ppm, so taking data for a few years would only make a small improvement to their error, reducing it from 14 to 12 ppm (assuming adding things in quadrature).

Posted by: Anonymous Coward | August 26, 2010 5:40 PM


If the exact and true value of G is discovered, creating a future time where we say "we should have seen that", it will be because it stands out in theory or is singularly distinctive in other context.

In the use of log tables, it was often necessary to extrapolate a value between those which were actually listed, by discerning the pattern of change and making a prediction. In multidimensional mappings of the values of fundamental constants expressed as common logs, recognizable as multidimensional log tables, there is one single, unique concurrence of numerical pattern alignments which occurs for a single value of G :


Posted by: John Aikman | August 26, 2010 6:42 PM

via http://scienceblogs.com/principles/

