*This is one of a series of posts about various aspects of my DIY weather station project YAWS. An overview of the project and links to all related posts are here.*

This post is focussed on the wind vane - how it works, how to connect it and (most importantly) how to get useful data out of it. Be warned that this journey takes us to some unexpected places.

### Which wind vane?

The wind vane I am using is a very common plastic unit that I suspect is readily available around the world. I got mine from Core Electronics, where it is part of their Weather Meters package.

### How does it work?

The data sheet for the unit gives a decent description of how it works. The base contains eight switches, each connected to a resistor of a different value. The rotating head contains a magnet that can close one or two of these switches at any time. This gives 16 possible resistances, and hence 16 possible output voltages, corresponding to 16 directions (North, North North East, North East and so on).

So you connect it up, supply an input voltage, read the output voltage and convert it to the corresponding direction. Simple in theory, but it gets slightly trickier in practice.

### Connecting the wind vane

The wind vane is connected via an RJ11 cable and jack, so the first problem is connecting an RJ11 socket. Of course you could simply cut off the jack, strip back the wires and connect them directly, but I like the convenience of being able to plug and unplug the sensors.

RJ11 connectors (sockets) are pretty easy to find. The problem is that the pins on the connector are in two rows, each spaced at 0.1" but offset from each other by half that spacing. This means that they cannot be plugged straight into a standard PCB / breadboard / prototyping board. The solution for me was a small breakout board that breaks out the six pins at 0.1" spacing.

Once you have access to the pins, wire one of the vane's two pins to ground and the other to your operating voltage via a 10k resistor. The output voltage is then read on the sensor side of the resistor. If you are using an Arduino this can be read via an analogue input pin. This is definitely the easier option and one reason for my ultimately going with an Arduino as the station controller.

If you are using a Raspberry Pi you will need to put the output voltage through an Analog to Digital Converter (ADC) to get a useable output. I used an MCP3008 8-channel 10 bit ADC. This was easy enough to wire up and use, but just adds a bit of complexity.

### Reading the data

Having now got a digital representation of the output voltage we need to read and interpret it. The datasheet helpfully provides a table of the resistances and corresponding voltages assuming a 10k resistor and 5V input. If you are using a Raspberry Pi you will need to rework the output voltages for 3.3V input. This is simple enough to do as the voltage at the measurement point is determined by the ratio of the resistance in the vane to the total resistance in the circuit, so:

Vout = Vin * R / (R + 10k)

where R is the resistance for the given direction (from the datasheet) and Vin is 3.3 for a Raspberry Pi. For example, for North R is 33k Ohms, so Vout is 2.53V on a Raspberry Pi.
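To make the rework concrete, here is a small Python sketch of the divider formula applied to every direction. It is an illustration rather than the station's code. Only the 33k value for North is quoted above; the other resistances are my reading of the datasheet table, so check them against your own copy. The names `VANE_RESISTANCES`, `R_FIXED`, `vane_vout` and `expected_voltages` are made up for the example.

```python
# Vane resistance in ohms for each direction number (0 = N, 1 = NNE ... 15 = NNW).
# ASSUMPTION: values as read from the datasheet table; verify against your copy.
VANE_RESISTANCES = [
    33000, 6570, 8200, 891, 1000, 688, 2200, 1410,
    3900, 3140, 16000, 14120, 120000, 42120, 64900, 21880,
]
R_FIXED = 10000  # the 10k resistor between the supply and the vane


def vane_vout(r_vane, vin):
    """Voltage at the measurement point: Vout = Vin * R / (R + 10k)."""
    return vin * r_vane / (r_vane + R_FIXED)


def expected_voltages(vin):
    """Expected output voltage for each direction number at supply voltage vin."""
    return [vane_vout(r, vin) for r in VANE_RESISTANCES]


# Print the 5V (Arduino) and 3.3V (Raspberry Pi) tables side by side.
for d, (v5, v33) in enumerate(zip(expected_voltages(5.0), expected_voltages(3.3))):
    print(f"direction {d:2d}: {v5:.2f} V at 5V, {v33:.2f} V at 3.3V")
```

Because the output is proportional to Vin, the 3.3V column is just the 5V column scaled by 3.3 / 5.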

The table will give you the voltages you expect for each direction, but I assume that there can be some noise in the signal so comparing the voltage against the exact value might not be reliable. To avoid this I used the value half way between each pair of voltages as the cutoff between one and the next. Unfortunately these values don't follow a neat progression, and are not in the same order as the directions. I ended up storing the cutoff voltages and corresponding direction numbers (0 = N, 1 = NNE etc) in a couple of arrays and using them to decode the voltage to a direction. In C++ for an Arduino this looks like:

```cpp
// voltage cutoffs for the weather vane and corresponding directions
const float vaneCutoffs[] = {0.365, 0.432, 0.536, 0.760, 1.048, 1.299, 1.693, 2.118,
                             2.590, 3.002, 3.254, 3.634, 3.939, 4.187, 4.474, VANE_VIN};
const int vanePoints[] = {5, 3, 4, 7, 6, 9, 8, 1, 2, 11, 10, 15, 0, 13, 14, 12};
```
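The lookup itself is not shown, so here is one way it might be done, sketched in Python rather than C++. It finds the first cutoff that is at or above the measured voltage and returns the matching direction number. The function name `decode_direction`, the `vin` parameter and the value of `VANE_VIN` are assumptions for the example. The cutoff and direction values are copied from the arrays above.

```python
from bisect import bisect_left

VANE_VIN = 5.0  # ASSUMPTION: the supply voltage the cutoffs were worked out for

# Cutoffs (upper bound of each band) and the direction number for each band.
VANE_CUTOFFS = [0.365, 0.432, 0.536, 0.760, 1.048, 1.299, 1.693, 2.118,
                2.590, 3.002, 3.254, 3.634, 3.939, 4.187, 4.474, VANE_VIN]
VANE_POINTS = [5, 3, 4, 7, 6, 9, 8, 1, 2, 11, 10, 15, 0, 13, 14, 12]


def decode_direction(voltage, vin=VANE_VIN):
    """Map a measured voltage to a direction number (0 = N ... 15 = NNW)."""
    # The divider output is proportional to Vin, so rescale to the 5V table.
    scaled = voltage * VANE_VIN / vin
    # Index of the first cutoff >= the reading, clamped to the last band
    # in case noise pushes the reading slightly above the supply voltage.
    i = min(bisect_left(VANE_CUTOFFS, scaled), len(VANE_POINTS) - 1)
    return VANE_POINTS[i]


print(decode_direction(3.84))           # 5V system, North
print(decode_direction(2.53, vin=3.3))  # 3.3V system, also North
```

Rescaling the reading means one table serves both supply voltages, at the cost of an extra multiplication per reading.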

### Really reading the data

So that gives us the ability to read the wind vane. As soon as you put the wind vane out in the open another challenge becomes apparent. If there is any sort of wind at all the vane moves around a lot, oscillating across a significant arc. Presumably it is swinging left and right of the "true" wind direction. A single reading is not going to tell you much about that underlying quantity that we are trying to measure.

My solution to this was to read the wind vane direction repeatedly over a short period of time and summarise this set of readings to provide one meaningful direction. I decided to read the wind vane every 200 milliseconds over a 10 second period, giving approximately 50 readings per observation. In my first iteration I used the most frequently counted direction (the mode) as the result.
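As a rough illustration of that sampling scheme, here is a Python sketch (the station itself runs on an Arduino, so this is not the actual code). The names `sample_direction_counts` and `modal_direction` are hypothetical, and `read_direction` stands in for whatever function returns the current direction number from the vane.

```python
import time


def sample_direction_counts(read_direction, duration_s=10.0, interval_s=0.2,
                            sleep=time.sleep):
    """Read the vane every interval_s seconds for duration_s seconds.

    Returns a list where counts[n] is the number of readings of direction n.
    """
    n_samples = int(round(duration_s / interval_s))  # 50 with the defaults
    counts = [0] * 16
    for _ in range(n_samples):
        counts[read_direction()] += 1
        sleep(interval_s)
    return counts


def modal_direction(counts):
    """Most frequently seen direction; ties go to the lowest direction number."""
    return max(range(len(counts)), key=lambda d: counts[d])
```

The `sleep` argument is only there so the loop can be exercised without waiting ten seconds. Note that the tie-break in `modal_direction` is arbitrary, which is one small weakness of using the mode.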

Another point to note is that you also need some indication of wind speed. In particular if the anemometer is showing zero wind speed, the wind vane reading will be meaningless.

This all seemed to work ok. At first look the direction data I was getting seemed reasonable and to correspond to what was going on outside.

### The first problem

The first problem I noticed was that after a few days the wind directions seemed to be consistently wrong. For example it would report a steady southerly when it was pretty obviously a westerly.

Now it goes without saying (I hope) that when the wind vane is mounted outside it needs to be correctly physically aligned. The sensor itself has no knowledge of "north", just which position of the rotating head is to be reported as north. The instrument must be lined up so that that position corresponds to actual physical north.

I had done this when I set up the weather station. The wind and rain sensors were mounted on a metal pole which was mounted on a star picket with cable ties. This seemed like a good system as it was easy to assemble and disassemble, so I could try it at various positions around the property.

While this is a fairly robust solution I had not noticed that in strong, gusty winds (welcome to spring at Orielton) the whole tube and sensor assembly could be slowly turned by the constant buffeting. This is what had happened and the wind vane was no longer correctly aligned. The solution was a quick re-alignment and a few turns of insulating tape or gaffer tape to provide enough stick to counteract the torque from the wind.

### The second problem

The second problem I found was more subtle, and demonstrates the importance of really looking at data critically and in detail. When I plotted out the direction data for a 24 hour period the result was "streaky".

In this example, there are lots of readings for west and periods of north-west and south-west, but virtually no readings of west north-west or west south-west. It doesn't seem credible that the wind would suddenly jump 45 degrees and avoid the intervening points.

Once I noticed this I took a harder look at the data and could see this as a consistent pattern. The numbers of reported observations for the eight intermediate points (NNE, ENE, ESE, SSE, SSW, WSW, WNW and NNW) were approximately 10% or less of the numbers reported for the other points (N, NE, E, SE, S, SW, W, NW). This is clearly an artefact of something in the measurement (in the physical sensor or in the code), not of the wind itself.

Looking at the design of the wind vane an obvious thought is that the more frequent directions are those that result from a single switch being closed. The less frequent are those that require the magnet to close two switches at once. Perhaps this design means that the intermediate points are harder to trip and are therefore under-reported. This would mean that effectively each of the intermediate points represented less than 1/16th of the arc of the vane.

Another possibility struck me. Perhaps the sensor had a slight bias against the "two switch" positions and my method of using the modal value from a set was amplifying this bias.

To test this out I looked at the detailed readings that the station was using (i.e. the ~50 readings taken in a 10 second period). What I saw was the same sort of 10:1 ratio of single switch to double switch directions, both across multiple sets and within each set. From this I was happy to conclude that the problem was with the sensor significantly under-reporting the two switch directions.
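A check like this is easy to script. The sketch below is in Python, with a made-up function name `two_switch_ratio`, and assumes the direction numbering used earlier (0 = N, 1 = NNE and so on), so the two switch directions are the odd numbers.

```python
def two_switch_ratio(dir_counts):
    """Ratio of two switch readings to single switch readings.

    dir_counts[n] is the number of readings of direction n (n in 0..15).
    Even numbers (N, NE, E ...) are single switch positions and odd numbers
    (NNE, ENE ...) are two switch positions.
    """
    single = sum(dir_counts[0::2])
    double = sum(dir_counts[1::2])
    return double / single if single else float("nan")


# A made-up set with ten times as many single switch readings.
example = [100, 10] * 8
print(two_switch_ratio(example))
```

For an unbiased sensor in variable winds you would expect a ratio somewhere near 1. The data described above corresponds to a ratio of around 0.1.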

### The solution

It might be a hardware problem, but short of buying a better sensor I couldn't think of a hardware fix. Perhaps there was something I could do in the software though. Clearly taking the modal value from the set of observations was a simplistic approach and does perpetuate the bias of the sensor. Perhaps some sort of average across the set would be better. After all if a set showed near equal counts for N and NE, it wouldn't seem unreasonable that NNE might be a better representation of the actual underlying wind direction than either N or NE.

I was representing the directions as integers in the range [0,15]. It quickly becomes obvious however that you can't simply average the set. For example, what is the average of one reading of 4 and one of 12 (East and West)? The glib answer would be 8 (South), but an equally valid answer would be 0 (North). Perhaps a better answer would be "Don't know" as a set of two diametrically opposite readings really doesn't give you any useful information.
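The failure is easy to demonstrate. This throwaway Python snippet (the function name is just for illustration) takes the plain arithmetic mean of a list of direction numbers:

```python
def naive_average(directions):
    """Plain arithmetic mean of direction numbers (0 = N ... 15 = NNW)."""
    return sum(directions) / len(directions)


print(naive_average([4, 12]))  # East and West -> 8.0 (South), an arbitrary pick
print(naive_average([15, 1]))  # NNW and NNE -> 8.0 (South), the opposite of North
```

The second case is the more worrying one: two readings either side of North average out to South because the numbering wraps around at 15.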

This is the problem of "averaging circular quantities", and there is a beautifully simple and elegant way to deal with it. In the general case the approach is to map the circular quantity onto angles in the range -pi to pi radians, convert each angle in the sample to a pair of coordinates on a unit circle and take the average of these points. The vector from the origin to this point is the average you were looking for and you simply map this back to a value in your original domain. The other neat thing is that the length of this vector, which will be in the range [0,1] (as the average point must be within the unit circle), gives you a measure of how much the points were clustered. I am going to call this the clustering index.

For example, a clustering index of 1.0 means that all elements in the sample were the same (perfect clustering). A clustering index of 0.0 means the sample was perfectly balanced around the circle, so the readings cancel out (as in the East, West example above).
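Here is the general idea as a minimal Python sketch, working directly on a list of angles in radians. The function name `circular_mean` is made up for the example. Because the angles are treated as compass bearings (clockwise from north), the sine goes in the x coordinate and the cosine in the y coordinate.

```python
import math


def circular_mean(angles):
    """Return (mean_angle, clustering_index) for a non-empty list of bearings
    in radians. The mean angle is in the range -pi to pi."""
    n = len(angles)
    x = sum(math.sin(a) for a in angles) / n  # average east component
    y = sum(math.cos(a) for a in angles) / n  # average north component
    return math.atan2(x, y), math.hypot(x, y)


# North and North East: the mean is half way between, and well clustered.
mean, ci = circular_mean([0.0, math.pi / 4])
print(math.degrees(mean), ci)

# East and West: the readings cancel, so the clustering index is close to zero
# and the mean angle carries no information.
mean, ci = circular_mean([math.pi / 2, -math.pi / 2])
print(ci)
```

In the first case the mean comes out at 22.5 degrees (NNE) with a clustering index of about 0.92.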

I appreciate that this may sound a bit opaque, but please bear with me, it really is pretty straightforward. Firstly the "map the circular quantity" business is simple because our circular quantity is already effectively an angle (a bearing in this case) and easily converted to values in the range -pi to pi radians. For a bearing A, measured clockwise from north, the corresponding point on the unit circle (i.e. a circle of radius 1 centred on the origin) has the coordinates (sin A, cos A). Averaging these points is just a matter of summing the x and y coordinates and dividing the sums by the sample size. The angle corresponding to the average point is returned by the atan2 function, and that angle can be mapped back to one of the 16 directions. The clustering index is simply the square root of the sum of the squares of the average point's coordinates (good old Pythagoras).

Below is a Python function to do this calculation for a set of direction counts. The functions angForDir() and dirForAng() simply translate directions to radian angles and back:

```python
import math

def circAvg(dirCounts, ciThreshold = 0.3):
    """
    Returns a tuple of the average direction [0,16] and clustering index [0,1]
    for the sample set dirCounts where dirCounts[n] is the number of
    observations for direction n (n in [0,15])
    Returns direction 16 (none) if the clustering index is <= threshold
    """
    sampleSize = sum(dirCounts)
    if sampleSize == 0:
        return (16, 1.0)
    # average point: x is the mean of the sines, y is the mean of the cosines
    avgPoint = (sum(math.sin(angForDir(d)) * c for (d, c) in enumerate(dirCounts) if c > 0) / sampleSize,
                sum(math.cos(angForDir(d)) * c for (d, c) in enumerate(dirCounts) if c > 0) / sampleSize)
    ci = math.sqrt(avgPoint[0] ** 2 + avgPoint[1] ** 2)
    if ci <= ciThreshold:
        avgDir = 16
    else:
        avgDir = dirForAng(math.atan2(avgPoint[0], avgPoint[1]))
    return (avgDir, ci)
```
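The two helper functions are not shown here. One plausible way to write them is sketched below, assuming direction n is a bearing of n x 22.5 degrees clockwise from north and that angles are folded into the range -pi to pi. The versions actually used in YAWS may differ in detail.

```python
import math

POINTS = 16  # number of compass points the vane can report


def angForDir(d):
    """Direction number 0..15 -> bearing in radians, folded into -pi..pi."""
    ang = d * 2 * math.pi / POINTS
    return ang - 2 * math.pi if ang > math.pi else ang


def dirForAng(ang):
    """Angle in radians -> nearest direction number 0..15."""
    # Python's % always returns a non-negative result here, so negative
    # angles (west of north) wrap round to directions 9..15.
    return round(ang * POINTS / (2 * math.pi)) % POINTS


# Round trip check: every direction should map to an angle and back again.
print([dirForAng(angForDir(d)) for d in range(POINTS)])
```

With helpers like these, `dirForAng` rounds to the nearest of the 16 points, so an average that lands between two single switch directions is reported as the intermediate point rather than being forced to one side.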

If this is still feeling a bit obscure, perhaps a few examples in diagrammatic form will help. These diagrams show the results of this approach on a number of real wind direction sample sets. The blue circles represent the readings. The larger the circle the more readings for that direction. The red line shows the average: the direction of the line is the average direction of the set, and the length of the line is the clustering index.

I think that these show that the average direction given by this method is a reasonable summary of the sample data. You can also see the correlation between the length of the line (the clustering index) and how clustered the sample data is (OK, "data are"). Intuitively there is a point at which the clustering is so low that the average becomes meaningless. My finger in the air guess is that a clustering index of less than 0.3 makes the result pretty meaningless.

When I ran this approach across a couple of hundred sets of actual data and compared the results against my initial modal method, approximately half of the samples got a different answer. These changes are, I think, improvements over the previous approach.

The wind direction chart below shows 24 hours of wind direction data using the averaging approach. The data no longer shows the streakiness evident in the previous chart. The data is (are!) still quite noisy in places, but I am more confident that this reflects the wind activity rather than artefacts of the sensor design. Or perhaps it would be more accurate to say that I think that this eliminates one artefact due to the sensor design.

### Where to next?

So after all of this, am I happy with the wind direction data that I am getting from YAWS? Not completely. The amount of variability seems very high, for example the period 10:00 to 14:00 in the chart above. I wonder whether these lightweight plastic wind vanes are simply subject to large oscillations in moderate winds. Perhaps it is simply that I have located the weather station in a spot where the air flow is turbulent due to nearby trees and this data is an accurate representation of the activity.

I can think of a number of options for investigating this further:

- Try relocating the station to another site further from any obstructions. Unfortunately given the peculiarities of our property there is nowhere that I can easily place the station which would perfectly meet the rule of placing the station at a distance of at least twice the height of an obstacle away from that obstacle.
- Try positioning the wind vane (and anemometer) higher off the ground. They are currently about 2 metres off the ground. My understanding is that the standard height for wind instruments, which BOM follows, is 10 metres (about 33 feet).
- Try modifying the wind vane hardware to damp down the oscillations, perhaps by adding some weight to the rotating head. I could see this just introducing new and exciting biases and artefacts to the data.
- Try different sampling and analysis methods to reduce the (suspected) noise.
- Try a completely different sensor technology. Ultimately I would love to build a two axis ultrasonic anemometer which would measure both wind speed and direction. This would have the advantage of eliminating mechanical artefacts from the measurement process.

For now I am probably going to leave the wind vane as it is for a period, and look at a different sensor design somewhere down the track (YAWS 2.0?).

### Conclusion

Who would have thought that there was so much to say about measuring the wind direction? This is a good example of why I have found YAWS such a rewarding project. Pretty much every aspect of the project is ripe with challenges and areas to explore and experiment.

Probably the main lesson I have taken from the wind vane experience is to not make assumptions about how a sensor will behave. I had assumed that as the wind vane could report 16 directions each would be equally accurately measured. I think that I have shown that this is not the case. As with any measurement it is important to look deeply at the data and the sensor, and try to understand exactly how the sensor works and how this may influence the data.