Calculate Missing Values

==CreateMissing.exe==
 
This utility is only for those who have already installed Cumulus MX:
This is only available if you already have MX installed and have been using it. Put simply, this utility updates [[Standard_log_files#Number_of_fields_per_line_varies_by_release|MMMYYlog.txt]] lines as it works through that file to work out daily extremes in order to recreate [[dayfile.txt]].
* It uses some files that are in the MX installation package
* It uses some files that are created when MX is run
 
These notes relate to "Create Missing Version 1.1.0" which is for "Cumulus MX release 3.12.0" and above. At the time of amending this section Create Missing version 1.0.2 had been tested and was found to have a bug.
 
Download link at [[Software#Create_Missing]] page; unzip, and install in same folder as CumulusMX.exe. ''Please note the developer does not fully describe his utility program at [https://github.com/cumulusmx/CreateMissing/blob/master/README.md his github page], so the author of this Wiki update cannot guarantee the documentation here is correct.''
 
This utility was written by Mark Crossley specifically to insert any missing data:
Check you do not have a file called '''dayfile.txt.sav''' in your [[Data folder|data sub-folder]]. If such a file exists, the utility program will not run.
* It renames any existing [[dayfile.txt]] to '''dayfile.txt.sav''' in your [[Data_folder|data sub-folder]]
** The utility will not run if a file called '''dayfile.txt.sav''' already exists in the data folder, because the utility is intended to be used just once
* If any of the following fields are not populated in [[Standard_log_files#List of fields in the file|MMMYYlog.txt]] lines, the utility can populate them as it works through that file to work out daily extremes (see calculations in table below).
** [[Wind chill|Wind Chill]]
** [[Apparent_temperature|Australian Apparent Temperature]]
** [[Feels Like|Feels Like temperature]]
** [[Heat index|North American Heat Index]]
** [[Temperature_(and_humidity)_measurement#Dry_and_Wet_Bulb|Wet Bulb temperature]]
** [[Temperature_(and_humidity)_measurement#How_Cumulus_software_handles_Temperature_and_Humidity|Dew Point]]
* The utility creates a new ''dayfile.txt'' populating each field as summarised in table below
 
Run the utility by changing directory to the folder where you installed it:
* On Linux operating systems, you need execute rights in that folder; type <code>mono CreateMissing.exe</code> (prefix with '''sudo''' if you don't have rights).
* On Microsoft Windows operating systems, type <code>CreateMissing</code> in a command window, Powershell window, or Terminal window (whichever is available when you right click in the folder or on the "Start" icon).
 
===Obtaining the Create Missing Utility===
The utility program, CreateMissing.exe, can be run while MX is left running, except at rollover time when MX writes to dayfile.txt (this includes any time while MX is doing "catch-up", as it can then do rollovers for past days). However, as MX only reads dayfile.txt when MX starts up, any changes this utility makes will not be picked up by MX until MX is stopped and restarted.
 
There is a download link on [[Software#Create_Missing]] page, unzip to reveal the components in the package, and install all of those in same folder as CumulusMX.exe.
This utility program, CreateMissing.exe, looks in [[Cumulus.ini]] for:
# The Cumulus start date in "StartDate=" parameter, which defaults to the date you first ran Cumulus (although it can be edited to another date, such as when you imported earlier data or moved to a new home after you first used Cumulus). That will be the earliest date the utility program processes.
#* However, if a dayfile.txt file exists and has an earlier date, then a prompt will ask you if you want to use the earlier date or the "StartDate=" date. If you answer "Y" to use earlier date, the utility program will continue, starting at that earlier date. If you answer with anything else, the utility program will exit.
#* The utility program will also check the start date against the current date, and will exit unless you have more than one day completed since you started using MX.
# The meteorological day start time in "RolloverHour=" and "Use10amInSummer=" parameters. This identifies which standard log lines belong to each day by checking against date and time of that line.
# Units and Number of decimal places associated with temperature, wind, wind average, rainfall, pressure, Ultra Violet, sunshine, evapotranspiration, wind run, and temperature trend.
# Thresholds for Heating Degree Days, Cooling Degree Days, and chill hours.
# Station type (needed to determine which source values are available)
# Month when Chill hours season starts
# Options (use zero bearing, use 10 minute wind average, use speed or gust for wind average calculation, fix maximum humidity, read or calculate dew point, read or calculate wind chill, synchronisation time, set clock option, read or calculate pressure trends, log extra sensors or not, ignore station clock or not, round wind speeds or not, air quality sensor or not, check that minimum number of sensors are updating or not, time for averaging wind bearings, time for averaging wind speed, and time for peak gust)
 
If you are installing it into a UNIX environment (e.g. a computer running Linux or Raspberry Pi operating system), the '''CreateMissing.exe''' file may need to be given execute access (see [[MX_on_Linux#chmod]])
The utility program, CreateMissing.exe, uses the same module to calculate derived values (like dew point, wind chill, apparent temperature, feels like, and Humidity Index) from source values as CumulusMX.exe. This is how it is able to [[Standard_log_files#Number_of_fields_per_line_varies_by_release|insert missing fields in the standard log files]]. As it processes each line in any standard log file, if a particular field is missing, it will calculate it from other fields in the line if it can. The utility program, CreateMissing.exe, can update the various [[:Category:Ini Files]] with any new extremes found while it is calculating new fields for any standard log file.
 
Please note that "Create Missing Version 1.1.0" can only be used with "Cumulus MX release 3.12.0" and above. It makes use of components not included in earlier MX releases.
The utility program, CreateMissing.exe, looks in the [[Data folder|data sub-folder]] to see if dayfile.txt exists (if it does, its file name has a '''.sav''' suffix added after the ''.txt''). If a file with the .sav suffix already exists, the utility program will stop, as it cannot create a file that already exists.
 
Earlier "Create Missing" versions can be found at https://github.com/cumulusmx/CreateMissing, and these will work with earlier MX releases, but be aware that Create Missing version 1.0.2 had been tested and was found to have a bug.
As the utility program, CreateMissing.exe, reads each line in any [[Standard log files|standard log file]] it is processing, it works out from the date and time in that line which meteorological date to assign that line to, using the rollover time and "use 10am in summer" settings. Any day with fewer than 5 standard log file lines is ignored for dayfile.txt, as is the current day.
# At the start of each day, any maximum field in the output file (that is created as [[dayfile.txt]]) is set to an extreme negative value, any minimum field in the same file is set to an extreme positive value, any daily total field is set to zero, and the cumulative chill hours field is set (either to the figure for the day before, or reset to zero for first day of new season).
# As each standard file line in that meteorological day is processed, any maximum field in the output file is updated if the read figure is higher than the previous figure, and any minimum field in the same file is updated if the read figure is lower than the previous figure. The total rainfall calculation changed between versions 1.0.0 and 1.0.1 (basically the rain total is taken from field 9 of the final line per day in the standard log file).
# For fields like Heating Degree Days and Cooling Degree Days, CreateMissing.exe tracks the temperature (field 2) against the threshold it has read from Cumulus.ini. If the degree day figure is missing, the utility inserts a figure based on the difference between temperature and threshold and the time since the previous log line: for heating degree days it is (heating threshold minus temperature) times (minutes since previous line) divided by 1440; for cooling degree days it is (temperature minus cooling threshold) times (minutes since previous line) divided by 1440.
# For total sunshine, day starts just after midnight, and ends next midnight; even if that is not rollover time; the figure stored in dayfile.txt is stored against the meteorological day that applies at 1 minute past midnight. Basically if rollover time is not midnight, the sunshine on a particular calendar day is assigned to previous date in dayfile.txt.
# The average temperature, added to dayfile.txt if it does not already exist there, is based on the same approach as used in [[today.ini]]. The utility maintains a cumulative total of time passed for the meteorological day, and a second cumulative total that is incremented, as each standard log file line is processed, by "minutes_since_previous_line multiplied by (latest_temperature + last_temperature) divided by 2".
# Chill hours output is incremented from existing figure by number of minutes since last line divided by 60, if the temperature field is below the threshold.
# Tracking the maximum rainfall in last hour figure for a day is obviously a bit complicated, as the utility has to work on a running hourly rainfall total (total rainfall on line being processed minus total rainfall on line for one hour ago).
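
The accumulation steps above can be sketched in Python. This is only an illustration of the arithmetic described in this list, not the code used by CreateMissing.exe: the function name <code>summarise_day</code>, its parameters, and the rule that only positive differences are counted towards degree days are all assumptions made for this example.

```python
from datetime import datetime


def summarise_day(lines, heat_threshold, cool_threshold, chill_threshold):
    """Summarise one meteorological day (hypothetical helper).

    lines: list of (datetime, temperature) tuples in time order.
    Returns (average_temperature, hdd, cdd, chill_hours).
    """
    total_minutes = 0.0   # cumulative time passed in the day
    temp_minutes = 0.0    # cumulative minutes * mean of latest and last temperature
    hdd = 0.0
    cdd = 0.0
    chill_hours = 0.0

    # Walk through consecutive pairs of log lines
    for (prev_time, prev_temp), (time, temp) in zip(lines, lines[1:]):
        minutes = (time - prev_time).total_seconds() / 60
        total_minutes += minutes
        temp_minutes += minutes * (temp + prev_temp) / 2

        # Assumption: a degree-day increment is only added when it is positive
        if temp < heat_threshold:
            hdd += (heat_threshold - temp) * minutes / 1440
        if temp > cool_threshold:
            cdd += (temp - cool_threshold) * minutes / 1440

        # Chill hours grow by the interval (in hours) while below the threshold
        if temp < chill_threshold:
            chill_hours += minutes / 60

    average = temp_minutes / total_minutes if total_minutes else None
    return average, hdd, cdd, chill_hours
```

For a whole season, the chill hours returned for one day would be added to the figure carried forward from the day before, as described in step 1.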
 
===Running the Create Missing Utility===
The utility program will output diagnostic messages both to any terminal session open, and to a file saved in [[MXdiags_folder|MXdiags sub-folder]].
 
The utility program can be run while MX is left running, except at rollover time when MX writes to dayfile.txt (this includes any time while MX is doing "catch-up", as it can then do rollovers for past days).
The typical output also includes a message when each log file is opened or finished with.
 
CumulusMX.exe needs .NET or Mono to be installed in order to run. CreateMissing.exe has the same requirement, so if MX already runs on your computer, CreateMissing.exe can be run there too.
 
# Open up the MX interface in a browser, and navigate to Settings menu -> Station Settings -> General Settings -> Advanced Options -> Records Began Date
#* The date that is shown there is the date where "Create Missing" will start by default, so if you have MMMYYlog.txt log files with an earlier date, edit the date here, keeping to same format
#* Click '''Save Settings''' button if you have made a change to this date
# Close your browser, and (if using an interactive screen for your computer) open up your file manager, or (if using a terminal session for access to your computer) navigate to your CumulusMX folder
# Navigate to your [[Data_folder|data sub-folder]]
# If there is a file there called '''dayfile.txt.sav''', rename that file to '''dayfile.txt.sav.bak''' (or any other name that does not already exist)
# Now change directory back up to the parent folder with <code>cd ..</code>, i.e. the folder containing "CreateMissing.exe"
#* On Linux operating systems, you need execute rights on that file; type <code>mono CreateMissing.exe</code> (prefix with '''sudo''' if you don't have rights).
#* On Microsoft Windows operating systems, type <code>CreateMissing</code> in a command window, Powershell window, or Terminal window (whichever is available when you right click in the folder or on the "Start" icon).
 
Remember, as MX only reads the dayfile.txt as MX starts up, any changes this utility makes will not be picked up by MX until MX is stopped and restarted.
 
===How the Create Missing Utility works===
 
The utility program will output to any terminal session open and to a file saved in [[MXdiags_folder|MXdiags sub-folder]].
 
This utility program looks in [[Cumulus.ini]] for:
# The Cumulus start date in "StartDate=" parameter, which defaults to the date you first ran Cumulus (as mentioned above it can be edited to another date, to include imported earlier data or to exclude data that relates to a former location).
#* That will be the earliest date the utility program processes.
#* However, if a dayfile.txt file exists and that has an earlier date, then "Create Missing" will only continue if you accept that earlier date.
# The meteorological day start time in "RolloverHour=" and "Use10amInSummer=" parameters.
#* This identifies which standard log lines belong to each day by checking against date and time of that line.
# The thresholds for Heating Degree Days, Cooling Degree Days, and Chill Hours
# The starting month for Chill Hours Season
 
This utility program looks in the [[Data_folder|data sub-folder]]:
# If there is a file there called '''dayfile.txt.sav''', the utility aborts
# If there is a file there called '''dayfile.txt''', the utility renames that to ''dayfile.txt.sav''
# The utility creates an empty file, naming it '''dayfile.txt'''
# The utility continues by opening the MMMYYlog.txt file for the earliest date (see above) it is going to process
#* See table below for how contents of this file are read/updated and used to create lines in the new '''dayfile.txt''' file
# Each subsequent MMMYYlog.txt file is processed in turn
 
===How the utility reports progress===
 
Here is a short section of typical output (from version 1.0.2, which had a bug and never processed the 1st day of a month):
<pre>
2021-06-08 19:35:44.108 Loading log file - data/Jul20log.txt
2021-06-08 19:35:44.191 01/07/2020 : No monthly data was found, not updating this record
2021-06-08 19:35:44.688 Date: 02/07/2020 : Adding missing data
2021-06-08 19:35:44.705 Date: 03/07/2020 : Adding missing data
2021-06-08 19:35:45.006 Date: 28/07/2020 : Entry is OK
2021-06-08 19:35:45.006 Date: 29/07/2020 : Entry is OK
2021-06-08 19:35:45.006 Date: 30/07/2020 : Entry is OK
2021-06-08 19:35:45.141 Date: 31/07/2020 : Adding missing data
2021-06-08 19:35:45.156 Finished processing log file - data/Jul20log.txt
2021-06-08 19:35:45.156 Loading log file - data/Aug20log.txt
</pre>
 
 
 
=== How the utility creates a dayfile.txt line ===
{| class="wikitable" border="1"
|-
!style="background:pink; width:250px" | dayfile.txt field
|colspan="4" style="background:lightgray; width:150px" | Standard log file fields
!style="background:pink; width:400px" | Description
|-
|style="background:pink;"| Daily derivative
|style="background:lightgray;"| Preferred field
|style="background:lightgray;"| First source
|style="background:lightgray;"| Second source
|style="background:lightgray;"| Third source
|style="background:pink;"| (how calculated)
|-
| [[Meteorological_day|date]]
|
| Day-Month-Year
| Hour-Minute
|
| From processing lines linked with that Meteorological day.
|-
|Highest wind [[Wind_measurement#Weather_Stations_and_Cumulus|gust]] speed
| colspan="4" | Cumulus '''Gust''' wind speed
| Stores highest value of that log file field in that Meteorological day.
|-
|[[Wind_measurement#Wind_Direction | Bearing]] of highest wind gust
| colspan="4" | Average wind bearing (in degrees)
| Stores the bearing recorded at same time as maximum value in previous field
|-
|Time of highest wind gust
| colspan="4" | Hour-Minute
| Stores the time in log file line used in two previous fields
|-
|Minimum [[Temperature_(and_humidity)_measurement#Cumulus_Calculated_Parameters | temperature]]
| colspan="4" | Current temperature
| Stores the lowest value of that log file field in that Meteorological day.
|-
|Time of minimum temperature
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Maximum temperature
| colspan="4" | Current temperature
| Stores highest value of that log file field in that Meteorological day.
|-
|Time of maximum temperature
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Minimum [[Pressure_Measurement | sea level pressure]]
| colspan="4" | Current sea level pressure
| Stores the lowest value of that log file field in that Meteorological day.
|-
|Time of minimum pressure
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Maximum sea level pressure
| colspan="4" | Current sea level pressure
| Stores highest value of that log file field in that Meteorological day.
|-
|Time of maximum pressure
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Maximum [[Rain_measurement#Rain_Rate | rainfall rate]]
| colspan="4" | [[FAQ#How_is_my_rain_rate_calculated.3F | Current rainfall rate]]
| Stores highest value of that log file field in that Meteorological day.
|-
|Time of maximum rainfall rate
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Total rainfall for the day
| colspan="4" | Total rainfall today so far
| Stores the value of that field in the last log file line for that Meteorological day.
|-
|[[Average temperature]] for the day
|
| Hour-Minute
| Current temperature
|
|Loop through every log file pair of fields in that Meteorological day:
# Work out interval time in minutes obtained by subtracting previous "Hour-Minute" field from current "Hour-Minute" field
# Work out product of above interval time times "Current temperature" field
# Sum the interval times in step 1 for whole day
# Sum the products in step 2 for whole day
# When completed loop, store the sum in step 4 divided by the sum in step 3
|-
|Daily [[Windrun | wind run]]
|
| Hour-Minute
| Cumulus moving ''''Average'''' of wind speed measurements over a particular period
|
|Loop through every log file pair of fields in that Meteorological day:
# Work out interval time in hours obtained by subtracting previous "Hour-Minute" field from current "Hour-Minute" field
# Work out product of above interval time times "Current average wind speed" field
# Sum the products in step 2 for whole day
# When completed loop, store the sum in step 3
|-
|Highest [[Wind_measurement#Weather_Stations_and_Cumulus|Average Wind Speed]]
| colspan="4" | Cumulus moving ''''Average'''' of wind speed measurements over a particular period
| Stores highest value of that log file field in that Meteorological day.
|-
|Time of Highest Avg. Wind speed
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Lowest [[Temperature_(and_humidity)_measurement | humidity]]
| colspan="4" | Current [http://en.wikipedia.org/wiki/Relative_humidity relative humidity]
| Stores the lowest value of that log file field in that Meteorological day.
|-
|Time of lowest humidity
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Highest humidity
| colspan="4" | Current relative humidity
| Stores highest value of that log file field in that Meteorological day.
|-
|Time of highest humidity
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Total evapotranspiration
| colspan="4" | Evapotranspiration
| Stores highest value of that log file field in that Meteorological day.
|-
|Total hours of sunshine
| colspan="4" | Hours of sunshine so far today
| Stores highest value of that log file field in that '''calendar''' day (i.e. midnight to midnight)
|-
|High USA [[Heat index]]
| Heat Index
| Current relative humidity
| Current temperature
|
| The heat index is a derived value. If the "preferred field" does not contain a valid number, that field is populated for each line linked with that Meteorological day, using the values in the fields named in the other columns of this table. Once every line in the day has a value in the preferred field, the highest is stored.
|-
| Time of high heat index
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
| High [[Apparent temperature]]
| Apparent temperature
| Current relative humidity
| Current temperature
|
| Apparent temperature is a derived value. If the "preferred field" does not contain a valid number, that field is populated for each line linked with that Meteorological day, using the values in the fields named in the other columns of this table. Once every line in the day has a value in the preferred field, the highest is stored.
|-
|Time of high apparent temperature
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Low apparent temperature
| Apparent temperature
| Current relative humidity
| Current temperature
|
| Apparent temperature is a derived value. If the "preferred field" does not contain a valid number, that field is populated for each line linked with that Meteorological day, using the values in the fields named in the other columns of this table. Once every line in the day has a value in the preferred field, the lowest is stored.
|-
|Time of low apparent temperature
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|High hourly rain
| colspan="4" | Total rainfall today so far
| High hourly rain is a derived value. Loop through every log file field in that Meteorological day, build up a series of hourly values (total rainfall in this entry minus total rainfall an hour earlier), find maximum of all those hourly values, and store that.
|-
|Time of high hourly rain
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Greatest [[wind chill]] (high wind speed, low temperature)
| Wind chill
| Cumulus moving ''''Average'''' of wind speed measurements over a particular period
| Current temperature
|
| Wind Chill can be reported by the weather station or it can be derived. If the "preferred field" does not contain a valid number, that field is populated for each line linked with that Meteorological day, using the values in the fields named in the other columns of this table. Once every line in the day has a value in the preferred field, the greatest wind chill (that is, the lowest wind chill temperature) is stored.
|-
|Time of greatest wind chill
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|High [[Temperature_(and_humidity)_measurement#Cumulus_Calculated_Parameters | dew point]]
| colspan="4" | Current dew point
| Dew Point can be reported by weather station or it can be derived. However, all Cumulus releases have this log file field. Stores highest value of that log file field in that Meteorological day.
|-
|Time of high dew point
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Low dew point
| colspan="4" | Current dew point
| Dew Point can be reported by weather station or it can be derived. However, all Cumulus releases have this log file field. Stores lowest value of that log file field in that Meteorological day.
|-
|Time of low dew point
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Today's dominant/average wind direction
|
| Cumulus moving ''''Average'''' of wind speed measurements over a particular period
| Average wind bearing (in degrees)
|
| The dominant/average wind direction is a derived value.
# Loop through every log file pair of fields in that Meteorological day:
#* Calculate increment in X as product of wind speed times sine of bearing, and sum those increments
#* Calculate increment in Y as product of wind speed times cosine of bearing, and sum those increments
# Convert final X and Y coordinates back to a bearing in degrees
|-
|[[Heat/cold degree days and Chill hours | Heating degree days]] (HDD)
|
| Hour-Minute
| Current temperature
|
|Loop through every log file pair of fields in that Meteorological day:
# Work out interval time in days obtained by subtracting previous "Hour-Minute" field from current "Hour-Minute" field
# Work out increment in HDD by subtracting current temperature from HDD threshold
# Work out product multiplying result in step 1 by result in step 2, and sum those products
# At end of loop store the final sum
|-
|[[Heat/cold degree days and Chill hours | Cooling degree days]] (CDD)
|
| Hour-Minute
| Current temperature
|
|Loop through every log file pair of fields in that Meteorological day:
# Work out interval time in days obtained by subtracting previous "Hour-Minute" field from current "Hour-Minute" field
# Work out increment in CDD by subtracting CDD threshold from current temperature
# Work out product multiplying result in step 1 by result in step 2, and sum those products
# At end of loop store the final sum
|-
|High solar radiation
| colspan="4" | current solar radiation
| Stores highest value of that log file field in that Meteorological day.
|-
|Time of high solar radiation
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|High UV Index
| colspan="4" | UV Index
| Stores highest value of that log file field in that Meteorological day.
|-
|Time of high UV Index
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|High [[Feels Like]] temperature
| Feels Like temperature
| Current relative humidity
| Cumulus moving ''''Average'''' of wind speed measurements over a particular period
| Current temperature
| Feels Like temperature is a derived value. If the "preferred field" does not contain a valid number, that field is populated for each line linked with that Meteorological day, using the values in the fields named in the other columns of this table. Once every line in the day has a value in the preferred field, the highest is stored.
|-
|Time of high feels like temperature
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|Low Feels Like temperature
| Feels Like temperature
| Current relative humidity
| Cumulus moving ''''Average'''' of wind speed measurements over a particular period
| Current temperature
| Feels Like temperature is a derived value. If the "preferred field" does not contain a valid number, that field is populated for each line linked with that Meteorological day, using the values in the fields named in the other columns of this table. Once every line in the day has a value in the preferred field, the lowest is stored.
|-
|Time of low feels like temperature
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
|High Canadian Humidity Index or [[Humidex]]
| Humidex
| Current relative humidity
| Current temperature
|
| The Canadian Humidity Index is a derived value. If the "preferred field" does not contain a valid number, that field is populated for each line linked with that Meteorological day, using the values in the fields named in the other columns of this table. Once every line in the day has a value in the preferred field, the highest is stored.
|-
|Time of high Humidex
| colspan="4" | Hour-Minute
| Stores the time in log file line used in the previous field
|-
| Cumulative Seasonal Chill Hours
|
| Current temperature
| Hour-Minute
|
| "Chill Hours" is a derived value, loop through every log file field in that Meteorological day:
# Work out interval time in hours obtained by subtracting previous "Hour-Minute" field from current "Hour-Minute" field
# Work out if there is increment in Chill hours by seeing if "Current temperature" field is below Chill Hours threshold temperature
# If there is an increment, sum value from step 1
# At end of loop, add the final sum to the value for the previous day and store the result (except on the first day of the month specified as the start of the Chill Hours season, when the sum is stored without adding the previous day's value)
|}
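
Two of the derived rows in the table above, the dominant wind direction and the high hourly rain, can be illustrated with a short Python sketch. The function names <code>dominant_bearing</code> and <code>max_hourly_rain</code> are hypothetical, and the sketch makes assumptions that the table does not state: a calm day returns a bearing of 0, the rain counter does not reset within the lines supplied, and "an hour earlier" means the earliest line that is no more than one hour old.

```python
import math
from datetime import datetime, timedelta


def dominant_bearing(samples):
    """samples: iterable of (wind_speed, bearing_degrees) pairs for one day."""
    samples = list(samples)
    # Sum the speed-weighted X (sine) and Y (cosine) components
    x = sum(speed * math.sin(math.radians(bearing)) for speed, bearing in samples)
    y = sum(speed * math.cos(math.radians(bearing)) for speed, bearing in samples)
    if x == 0 and y == 0:
        return 0.0  # assumption: no wind at all gives bearing 0
    # Convert the summed vector back to a bearing in the range 0 to 360
    return math.degrees(math.atan2(x, y)) % 360


def max_hourly_rain(lines):
    """lines: list of (datetime, rain_today_so_far) tuples in time order.

    Returns (highest_hourly_rain, time_of_highest); time is None if no rain.
    """
    best, best_time = 0.0, None
    start = 0  # index of the earliest line inside the one-hour window
    for time, total in lines:
        while lines[start][0] < time - timedelta(hours=1):
            start += 1
        hourly = total - lines[start][1]
        if hourly > best:
            best, best_time = hourly, time
    return best, best_time
```

Summing vector components, rather than averaging the bearings directly, avoids the wrap-around problem where bearings of 350 and 10 degrees would otherwise average to 180 instead of north.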
 
==Using a PHP script on your web server==