Brave New World^WWeather Applet
This is loosely related to AppletsRevisited
The goal: implement a server, weather.gnome.org, that hands out statically generated weather reports as XML. Have a locations file available in multiple translations, similar to the existing Locations.xml.in, that simply gives codes to retrieve XML data from the server.
Benefits
- When we want to add new weather locations they will be available to all users of GNOME,
- We can design the system so that we can add new data whenever we like, that old clients will ignore and new clients can take advantage of,
- We can write it in Python!!
- We can have a modularized architecture for talking to various weather sources, allowing us to add things to talk XML-RPC, other protocols, or simply screen scrape,
- Screenscraping will happen from a single source, x-thousand GNOME users won't all be doing it,
- By using other weather sources besides METAR we can get a higher resolution of locations (i.e. places inside a city),
- For cases where screenscraping is used, only weather.gnome.org would need updating when the source website changes, rather than requiring every GNOME user to upgrade to get weather data again,
- Integration into other software, like Evolution. Current weather backend.
- others...
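The modular architecture mentioned above could look roughly like the following. This is only a sketch: `WeatherSource`, `SOURCES`, `register`, `StaticSource` and `observe` are hypothetical names invented for illustration, and the canned backend stands in for a real one that would speak METAR, XML-RPC, or screen scrape a website.

```python
from abc import ABC, abstractmethod


class WeatherSource(ABC):
    """Hypothetical interface that every data-source backend implements."""

    name = "base"

    @abstractmethod
    def fetch(self, location_code):
        """Return a dict of observation fields for one location."""


# Registry mapping backend names to classes; backends add themselves here.
SOURCES = {}


def register(cls):
    """Class decorator that adds a backend to the registry."""
    SOURCES[cls.name] = cls
    return cls


@register
class StaticSource(WeatherSource):
    """Stand-in backend returning canned data (illustration only)."""

    name = "static"

    def fetch(self, location_code):
        return {"location": location_code, "temperature_c": 33.6}


def observe(source_name, location_code):
    """Look up a backend by name and ask it for an observation."""
    return SOURCES[source_name]().fetch(location_code)
```

With this shape, adding a new weather source means adding one decorated class on the server; clients never see which backend produced the data.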
Disadvantages
- Single point of failure
FrankSolensky: This can be mitigated somewhat if the source is distributed in some manner; e.g.: using URNs.
LuisVilla4: how is this any different than the current system?
DavydMadeley: Luis, this is ambiguous. How is having a single point of failure any different, or how is this solution different?
JamesHenstridge: the current weather applet has a single point of failure in the form of weather.noaa.gov (or similar). The benefit of having the weather applet grab data from a weather.gnome.org server is that we would have some control over the availability (e.g. multiple servers with round-robin DNS). Also, if the server code is available, distributors could easily run their own weather server if they don't want to rely on the availability of the GNOME web servers.
MarkTearle: Would Anycast hosted servers be a solution here? RoundRobin solutions don't necessarily solve Denial of Service problems.
JamesHenstridge: that paper seems to say that anycast isn't a good idea for TCP based protocols, since connections may suddenly get routed to a different anycast instance midstream. My point was that a weather.gnome.org service could be made reliable in the same way as other web sites can be.
- Such a weather server would be in no way GNOME-specific. If we create a reliable, easily-machine-parseable, free-as-in-beer source of weather forecasts, it will take about 12 minutes before there are Windows and Mac (and KDE, and PHP, etc) clients to access it as well. We would essentially be providing weather service for the entire Internet.
BryanClark: Perhaps we can take advantage of this and create a web site on weather.g.o (weather go, I like it!) that lists available Windows, Linux, and Mac applications. Possibly create a weather service web application that's useful as well. Some Google Ads on the site could mitigate any burden or additional load we've generated.
Offering a central server seems like an overestimation of our capabilities. What about giving the user the opportunity to set his own source? This way the user could always get the most recent data for his location and there would not be any of the problems mentioned above. Sure, this would include some kind of plugin interface for the different services, but this leads too far into design already. PavelRoitberg
Design
Ok, so what are the features of the current applet?
- Current weather
- Conditions
- Sky (clouds etc)
- Temperature
- Dew point
- Humidity
- Wind - direction and speed
- Pressure
- Visibility
- Sunrise/sunset (in CVS)
- Forecast - this is obtained differently for some locations, eg Australia and the UK
- Radar image - obtained from somewhere, or custom URL (again for Australia)
- Icon of the current conditions and temperature.
Need fields for acknowledgements for data use: eg requirement for using BOM radar maps
Things we want to add:
- Four Day Forecast (conditions and temperature)
- Climate information (??)
- Multiple (animated) RADAR images
DavidLodge: How about pollen count/UV index?
GarethOwen: Severe weather alerts (eg Flood warnings, Tornado alerts etc.)
MatsTaraldsvik: If there is a possibility of freezing rain. Not severe, but a general weather alert.
MatsTaraldsvik: The moon phase could be useful and easy to implement, as it can be calculated.
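As a rough illustration of how the moon phase could be calculated, here is a minimal sketch. It assumes a constant mean synodic month and counts from a commonly used reference new moon (6 January 2000, 18:14 UTC), so results can be off by several hours; the function names are hypothetical, and a real implementation would probably want a proper astronomical algorithm.

```python
from datetime import datetime, timezone

# Mean length of a lunar cycle in days (an average; real cycles vary).
SYNODIC_MONTH_DAYS = 29.530588853
# Reference new moon used as the starting point of the count.
REFERENCE_NEW_MOON = datetime(2000, 1, 6, 18, 14, tzinfo=timezone.utc)

PHASE_NAMES = [
    "new moon", "waxing crescent", "first quarter", "waxing gibbous",
    "full moon", "waning gibbous", "last quarter", "waning crescent",
]


def moon_age_days(when):
    """Days since the last new moon; `when` must be timezone-aware."""
    elapsed = (when - REFERENCE_NEW_MOON).total_seconds() / 86400.0
    return elapsed % SYNODIC_MONTH_DAYS


def moon_phase_name(when):
    """Map the moon's age onto the nearest of eight named phases."""
    fraction = moon_age_days(when) / SYNODIC_MONTH_DAYS
    return PHASE_NAMES[int(fraction * 8 + 0.5) % 8]
```

Since this needs no external data, it could run either on the server or in the applet.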
Format
XML, obviously. However, I think there are benefits to delivering it over Atom, using a custom namespace. Specifically, clients other than the weather applet can read it, the content model is easily extendable, it's designed for clients to check regularly, and we're already going to be using XML anyway. The only drawbacks I can think of are that it will increase file size, and that people might demand an OPML (horrible, horrible format) file, but we can just have an xhtml page with atom autodiscovery links to every feed instead.
DavydMadeley: if anything I release uses OPML, several people have indicated an intention to fly to Australia and cause me pain
Details here and here. Basically we just include extra tags in our namespace in the atom:entry elements, with the current conditions and forecast in atom:content for normal feed readers.
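To make the idea concrete, here is a sketch of an atom:entry carrying extra elements in a custom namespace, read by a client that only knows some of them. The namespace URI, the element names (`temperature`, `humidity`, `pollen`) and the sample values are all invented for illustration; nothing here is an agreed format.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
# Hypothetical namespace for the weather extensions.
WX = "http://weather.gnome.org/ns#hypothetical"

SAMPLE = f"""<entry xmlns="{ATOM}" xmlns:wx="{WX}">
  <title>Perth</title>
  <content type="text">Sunny, 33.6 C</content>
  <wx:temperature unit="celsius">33.6</wx:temperature>
  <wx:humidity unit="percent">16</wx:humidity>
  <wx:pollen>high</wx:pollen>
</entry>"""

# Fields this (old) client understands; anything else is skipped.
KNOWN_FIELDS = ("temperature", "humidity")


def read_entry(xml_text):
    """Extract the plain-text summary plus the known weather fields."""
    entry = ET.fromstring(xml_text)
    result = {"summary": entry.findtext(f"{{{ATOM}}}content")}
    for field in KNOWN_FIELDS:
        value = entry.findtext(f"{{{WX}}}{field}")
        if value is not None:
            result[field] = float(value)
    return result
```

The `wx:pollen` element is silently ignored by this client, which is the behaviour wanted for old clients when the server starts publishing new data, while a normal feed reader would only show the atom:content text.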
So what is going to be in this XML namespace? Well, NOAA has XML with various attributes at http://www.nws.noaa.gov/data/current_obs/, example data file.
LuisVilla4: see also http://www.nws.noaa.gov/forecasts/xml/ which appears to be the new canonical way of getting US weather data, according to Wired: http://www.wired.com/news/technology/0,1282,65919,00.html?tw=wn_tophead_1
JamesHenstridge: In a previous development cycle, the weather applet was set up to download weather information in XML format from a site called weather.interceptvector.com. This code was backed out when the site disappeared. The schema for their format is available through the wayback machine though.
If using an XML format, it is probably worth trying to keep the data as small as possible, and as cache-friendly as possible. This might include using a gzip transfer encoding, and setting appropriate expiry times on the files served (eg. if the weather data for a particular city only updates once every hour, then tell the client so it doesn't try to reload the data every 15 minutes). If forecast data is issued once or twice a day, but the current observation is updated every 15 minutes, then it might make sense to serve them as separate resources.
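A sketch of what the server side of this might look like when preparing a response. It assumes gzip is applied as an HTTP `Content-Encoding` and that the expiry is derived from a per-location update interval; `build_response` and its parameters are hypothetical names, not an existing API.

```python
import gzip
import time
from email.utils import formatdate


def build_response(body, update_interval_s, now=None):
    """Compress `body` and compute caching headers for it.

    `update_interval_s` is how often the underlying data changes,
    e.g. 3600 for a station that reports hourly.
    """
    if now is None:
        now = time.time()
    compressed = gzip.compress(body)
    headers = {
        "Content-Type": "application/atom+xml",
        "Content-Encoding": "gzip",
        "Content-Length": str(len(compressed)),
        # Tell clients and proxies not to ask again until the data can change.
        "Cache-Control": f"max-age={update_interval_s}",
        "Expires": formatdate(now + update_interval_s, usegmt=True),
    }
    return headers, compressed
```

Serving forecasts and observations as separate resources would simply mean calling this with different intervals for each.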
Similarly, the weather applet should remember the etag/mtime of its previous request so that it can revalidate its current data rather than redownload the same data. This would probably involve not using gnome-vfs as an HTTP client ...
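The applet itself is written in C, but the revalidation logic can be sketched in Python using the standard library. The `cached` dict layout and the function names are assumptions made for this example.

```python
import urllib.error
import urllib.request


def conditional_headers(cached):
    """Build revalidation headers from what the previous response told us."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


def refresh(url, cached):
    """Fetch `url`, reusing `cached` if the server answers 304 Not Modified."""
    request = urllib.request.Request(url, headers=conditional_headers(cached))
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return {
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
                "body": response.read(),
            }
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Nothing changed: keep the data we already have.
            return cached
        raise
```

On the first run `cached` is an empty dict, so no conditional headers are sent and the full document is downloaded.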
JamesAndrewartha: Looking at the weather.xsd schema, the things it's missing compared to the NOAA data are lat/long/elevation (although we'd serve sunset/sunrise calculated from these on the server anyway), windchill, heatindex and dewpoint. Separating out the forecast is also a good idea, for your reason but also because observations and forecasts can have different coverage areas (eg I can get Swanbourne observations, but the forecast is for the Perth region). So for each atom entry we'd have a <atom:link rel="observation" href="whatever"> and a <atom:link rel="forecast" href="whereever">.
FrankSolensky: (serve sunrise/set calculated from server): Calculating them on the server and downloading them might put more of a CPU load than needed -- might it be better to load the lat/lon parameters when the location is selected and calculate the times locally?
DavydMadeley: since all data is cached, this will not be an issue IMHO. If required, we can calculate this data in idle periods, a year in advance if required. Offloading this onto the server might enable us to utilise some really nifty calculation utilities like sunrisenset by Geoscience Australia. Which actually takes account of things we can't such as approximate air refraction. It can also calculate civil twilight and moon rise and set times.
Locations
This leads into the format for the locations database. Basically I think if you have a tree of atom files, with each region's file being a bunch of atom:entry elements, one per subregion, each having an atom:link to that subregion's atom file, everything is fairly sane. The leaves on the tree are atom:entry elements with atom:link elements as above. Also have an attached xsl transform so that people can easily browse using a web browser, and courtesy elements like <atom:content><a href="whereever">Weather data for whatever region</a></atom:content>.
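One node of such a tree could be generated along these lines. This is a sketch only: `region_feed` and `links_by_rel` are hypothetical helpers, the file paths are made up, and the `rel` value used for pointing at a subregion's file has not been decided anywhere on this page.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"


def region_feed(title, children):
    """Build one region's atom feed.

    `children` is a list of dicts with a "title" and a "links" mapping of
    rel -> href. Leaves use rel="observation" / rel="forecast"; the rel for
    a subregion link is whatever the caller chooses.
    """
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}title").text = title
    for child in children:
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        ET.SubElement(entry, f"{{{ATOM}}}title").text = child["title"]
        for rel, href in child["links"].items():
            ET.SubElement(entry, f"{{{ATOM}}}link", rel=rel, href=href)
    return feed


def links_by_rel(entry):
    """Return {rel: href} for every atom:link inside an entry."""
    return {
        link.get("rel"): link.get("href")
        for link in entry.findall(f"{{{ATOM}}}link")
    }
```

A client walking the tree would fetch a region file, call `links_by_rel` on each entry, and either descend into a subregion or stop at a leaf that carries observation and forecast links.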
Radar images - just a link to the hosted site as is currently done?
Ok, I have basic working code to generate a tree of atom files from Locations.xml.in. Localisation can be handled later, as can the question of where alphabetisation should be done. I still need to inherit zone and radar attributes.
This leads into the question of how the parsers get the location information - currently I'm thinking some sort of pickled text dict for each data source.
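A pickled dict per data source might be as simple as the following sketch. The dict layout (location code mapped to source-specific fields) is an assumption, and pickle protocol 0 is used only because it is the text-based one; pickles should only ever be loaded from files the server itself wrote.

```python
import pickle


def dump_locations(locations):
    """Serialise one data source's location table with the text protocol."""
    return pickle.dumps(locations, protocol=0)


def load_locations(blob):
    """Inverse of dump_locations; only use on trusted data."""
    return pickle.loads(blob)


# Hypothetical table for a METAR backend: location code -> what the
# parser needs to know in order to fetch that location.
METAR_LOCATIONS = {
    "YPPH": {"name": "Perth", "zone": None, "radar": None},
}
```

Each parser would then load only its own table, so adding a data source does not require touching the others.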
Data Sources
We have several sources to gather data:
- METAR - gweather currently supports this, but the parser could do with rewriting to make it more flexible
- Do we want to use the current C parser or rewrite in python?
DavydMadeley: rewrite in Python. I was fiddling around with it the other day, trying to come up with a new design for the existing METAR parser to make it more flexible. The current parser is very complex and hard to debug, so it definitely needs rewriting.
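One possible shape for a more flexible Python parser is a table of (pattern, handler) pairs, so that supporting a new METAR group means adding one table entry. The sketch below is not the design DavydMadeley was working on; it handles only wind, temperature/dew point and QNH pressure, and the sample reports used with it are made up.

```python
import re


def _signed(text):
    """METAR writes negative temperatures with an 'M' prefix, e.g. M02."""
    return -int(text[1:]) if text.startswith("M") else int(text)


def _wind(m):
    return {
        "wind_dir": m["dir"],  # degrees as text, or "VRB" for variable
        "wind_speed_kt": int(m["speed"]),
        "wind_gust_kt": int(m["gust"]) if m["gust"] else None,
    }


def _temperature(m):
    return {"temperature_c": _signed(m["temp"]), "dewpoint_c": _signed(m["dew"])}


def _pressure(m):
    return {"pressure_hpa": int(m["hpa"])}


# Each group is recognised by one regular expression; order does not matter
# because the patterns do not overlap.
GROUPS = [
    (re.compile(r"(?P<dir>\d{3}|VRB)(?P<speed>\d{2,3})(?:G(?P<gust>\d{2,3}))?KT"), _wind),
    (re.compile(r"(?P<temp>M?\d{2})/(?P<dew>M?\d{2})"), _temperature),
    (re.compile(r"Q(?P<hpa>\d{4})"), _pressure),
]


def parse_metar(report):
    """Parse the groups listed in GROUPS; unknown tokens are skipped."""
    tokens = report.split()
    result = {"station": tokens[0]}
    for token in tokens[1:]:
        for pattern, handler in GROUPS:
            match = pattern.fullmatch(token)
            if match:
                result.update(handler(match))
                break
    return result
```

Because unrecognised tokens are skipped rather than treated as errors, a report containing groups the table does not know yet still yields the fields it does know.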
TAF - forecast data, gweather doesn't support this. Available from NOAA eg. TAF for Perth - Useful parsers for TAF do exist: forecast for YPPH
New XML format indicated by LuisVilla above.
Australian Bureau of Meteorology provides a number of data files:
SILO Climate and Rainfall data files containing observations from every station in Australia for a particular day. Easily parsed with any text processing tools.
Radar images for rainfall in capital cities.
Current observations, eg. for Perth. This only seems to be available as HTML now, but seems to have easily parseable comments like the following:
<!-- HEAD, Time, Name, Temperature , Dewpoint , Relative Humidity , Wind Direction, Wind Speed, Wind Gust, Pressure, Rain Since 9am, Min Temperature, Time of Min, Max temperature, Time of Max, Direction of Max Gust, Max Gust, Time of Max Gust -->
<!-- UNIT, date, null, celsius, celsius, %, 16_point, km/h, km/h, hPa, mm, celsius, date, celsius, date, 16_point, km/h, date -->
<!-- DATA, 20050117:161100, PERTH METRO, 33.6, 4.0, 15.6235278594697, SW, 22, 35, 1012.8, 0, 18.8, 20050116:233900, 39.5, 20050117:132800, E, 25, 20050117:020900 -->
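Comments in this form can be picked out of the page with a regular expression and zipped together. The following is a sketch that assumes each page carries exactly one HEAD, one UNIT and one DATA comment with the same number of comma-separated fields; `parse_observation` is a hypothetical name and values are left as strings.

```python
import re

# Matches the three comment kinds and captures their comma-separated payload.
COMMENT_RE = re.compile(r"<!--\s*(HEAD|UNIT|DATA)\s*,(.*?)-->", re.DOTALL)

SAMPLE = """
<!-- HEAD, Time, Name, Temperature , Dewpoint , Relative Humidity , Wind Direction, Wind Speed, Wind Gust, Pressure, Rain Since 9am, Min Temperature, Time of Min, Max temperature, Time of Max, Direction of Max Gust, Max Gust, Time of Max Gust -->
<!-- UNIT, date, null, celsius, celsius, %, 16_point, km/h, km/h, hPa, mm, celsius, date, celsius, date, 16_point, km/h, date -->
<!-- DATA, 20050117:161100, PERTH METRO, 33.6, 4.0, 15.6235278594697, SW, 22, 35, 1012.8, 0, 18.8, 20050116:233900, 39.5, 20050117:132800, E, 25, 20050117:020900 -->
"""


def parse_observation(html):
    """Return {field name: (value, unit)} from the HEAD/UNIT/DATA comments."""
    rows = {}
    for kind, body in COMMENT_RE.findall(html):
        rows[kind] = [field.strip() for field in body.split(",")]
    if set(rows) != {"HEAD", "UNIT", "DATA"}:
        raise ValueError("expected HEAD, UNIT and DATA comments")
    if not len(rows["HEAD"]) == len(rows["UNIT"]) == len(rows["DATA"]):
        raise ValueError("HEAD, UNIT and DATA have different lengths")
    return {
        name: (value, unit)
        for name, unit, value in zip(rows["HEAD"], rows["UNIT"], rows["DATA"])
    }
```

Raising on a missing or mismatched comment makes it obvious when the source page changes, which is the case where only weather.gnome.org should need updating.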
Forecasts. eg. for Perth via HTTP, or via FTP, raw. Forecasts are provided as plain text, but the temperature forecasts could probably be extracted with regexps fairly reliably.
JohnClarke: They also provide aviation weather reports (TAFs, TTFs, METARs etc). Login is required, but the username and password are published on http://www.bom.gov.au/reguser/by_prod/aviation/. I haven't been able to find separate METARs for each location, only the aggregate page for each area. The areas are the aviation areas defined by Air Services Australia (for example, Sydney is in area 20, Perth is in area 60).
- others...
- Local weather information sources (BoM, the UK one).
DavydMadeley: Do places like BoM have an easily parsable weather observations/forecast format, or are we going to have to screen scrape?
- seems the answer is yes, although you have to mail them about it. Not sure if there is an associated charge.
DavidTrowbridge: In the USA, NWS stations now provide forecasts in the fairly easily parsable CCF format
DavidLodge: The UK Met Office runs a service "Data Products Distribution System" though it seems to charge quite large amounts, see http://www.met-office.gov.uk/wfc/. The page http://www.met-office.gov.uk/education/archive/uk/ holds an archive of the last thirty days of data, running one hour behind; which seem to be in a parsable format (HTML tables), but using this data directly may be against the terms and conditions.
- weather.com? currently used for radars I think.
http://www.rssweather.com - Might be useful for some locations
JamesHenstridge: unless rssweather contains data not available elsewhere, it doesn't offer much value over screen scraping other websites. It just provides weather as HTML wrapped in RSS, rather than in an easy machine readable format.
BenGoodger: the BBC provides RSS feeds of much of their content, as well as a number of APIs, making the content accessible through those APIs free for non-commercial use. Properly auto-mangled into GWeatherXML and Atom, using this could increase the number of locations available in the UK to many thousands. It'd also mean that my location had weather capability.
http://www.wxqa.com/ Perhaps also support for data from the Citizen Weather Observer Program which would give more data points than METAR data, probably more current and local.
G!SMETEO.ru Russian service covering Russia, Europe and bigger world cities; there's also a WAP and PDA service which might be way easier to screen-scrape (though they don't provide the full 10-day forecast like the main service does)
yr.no Norwegian service providing forecasts for 700,000 places in Norway and 6.3 million places worldwide. « yr.no is unique in Europe because of our very detailed weather forecasts and our free data policy. » They provide data in Norwegian Bokmål, Norwegian Nynorsk and English (XML (link), PHP (link) and GRIB (link)). There are also APIs for radar images (link) , and for 'ExtremesWWC' - Wettest, Warmest and Coldest places (link).
MatsTaraldsvik: This service is very accurate, and covers more than 700,000(!) places in Norway and 6.3 million worldwide. The fact that it has a free data policy, and provides feeds in various formats, should make it easy to use.