This repository has been archived by the owner on Jun 2, 2024. It is now read-only.
weatherfile.py
"""
WeatherFile module.

Read and parse weather data from a text file.

# Author - Christopher Teh Boon Sung
------------------------------------

"""

import re
from collections import OrderedDict

from annualweather import AnnualWeather


class WeatherFile(AnnualWeather):
    """
Weather file class to read and parse weather data from a given weather file.

Given a weather file (comma-delimited, plain text format), this class loads the file, then
reads and parses its content for the weather data. The weather data are then stored in a
list instance attribute named `alldata`.

The weather file (named "weather.txt" in this example) must follow this plain-text format:
```text
# lines starting with hash (#) are comments (comments are optional)
# sample weather data in "weather.txt"
*doy,sunhr,tmax,tmin,rh,wind,rain
1,5.95,31.4,22,87,1.04,0
2,5.1,32.7,21.9,96,1.02,0
...
364,5.5,31.7,20.5,95,0.73,0.4
365,8.05,32,20.6,96,0.58,0
```
where the data are separated by commas. The first two rows of the file begin with the `#`
character, which marks these two lines as comments. You can have as many comment lines in
a weather file as you like, so long as each comment line starts with `#`. Comments
are ignored by this class. Note: comments, if any, must appear only at the beginning of the
file. You cannot have comments within the weather data or at the end of the file.
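
As a rough illustration of these comment rules (a sketch only, not this class's own loader),
the snippet below skips the blank and comment lines at the top of a file and treats the first
remaining line as the header. The function name `split_comments` is hypothetical.

```python
def split_comments(lines):
    # Hypothetical helper: skip blank and '#' lines at the top of the file only.
    start = 0
    while start < len(lines):
        text = lines[start].strip()
        if text and not text.startswith('#'):
            break  # first non-blank, non-comment line holds the column headers
        start += 1
    if start == len(lines):
        raise ValueError('no header line found')
    header = lines[start].strip()
    # everything after the header is weather data (blank lines are dropped)
    rows = [ln.strip() for ln in lines[start + 1:] if ln.strip()]
    return header, rows


sample = [
    '# lines starting with hash (#) are comments',
    '# sample weather data',
    '*doy,sunhr,tmax,tmin,rh,wind,rain',
    '1,5.95,31.4,22,87,1.04,0',
    '2,5.1,32.7,21.9,96,1.02,0',
]
header, rows = split_comments(sample)
print(header)     # *doy,sunhr,tmax,tmin,rh,wind,rain
print(len(rows))  # 2
```
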
After the comment lines, you list the weather data, with the values separated by
commas. The third line in "weather.txt" above contains the column headers, where "doy" is
the day of year, "sunhr" is the sunshine hours, "tmax" and "tmin" are the maximum and minimum
air temperature, respectively, "rh" is the relative humidity, "wind" is the wind speed, and
"rain" is the rainfall amount. Column headers are mandatory because they indicate what kind
of data are being stored.
Header names are user-defined and need not follow the example in "weather.txt". For example,
you can have column headers named "year", "month", and "day" to indicate the year,
month, and day, respectively. You can even use "rainfall" as the header for rain, instead of
"rain", as used in the "weather.txt" file example. Please note that header names are
case-sensitive, so the column headers "rain", "Rain", and "RAIN" are three different headers.
Note the symbol `*` in front of "doy" column header in "weather.txt". This symbol `*`
indicates that "doy" is the key used to search for weather data. So, if we want to search for
the weather data when the day of year is 2 (doy==2), this class will return
"5.1,32.7,21.9,96,1.02,0" (see the fifth row in "weather.txt" above).
This class reads the file from top to bottom, so once it reaches the end of the weather
data, the class will 'rewind' so that the data search begins again from the top. So, after
searching the weather data for doy==365, if the data have been exhausted, the next call to
search for the weather data for doy==1 will return "5.95,31.4,22,87,1.04,0" (see the fourth
row in "weather.txt" above).
You can have as many keys as you want, so long as you precede the column headers with `*`.
In this example, we have created three keys:
```text
# "weather2.txt"
*year,*month,*day,doy,sunhr,tmax,tmin,rh,wind,rain
1990,1,1,1,5.95,31.4,22,87,1.04,0
1990,1,2,2,5.1,32.7,21.9,96,1.02,0
...
```
so that "year", "month", and "day" are keys that will be used to search for the weather data.
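
To make the idea of searching by several keys concrete, here is a minimal sketch. It is an
illustration only and does not reproduce how this class stores or searches its data; the
function name `find_row` and its keyword arguments are hypothetical.

```python
def find_row(header, rows, **wanted):
    # Hypothetical helper: return the non-key fields of the first row whose
    # key columns (headers marked with '*') all match the wanted values.
    names = [h.strip() for h in header.split(',')]
    keys = [n[1:] for n in names if n.startswith('*')]
    plain = [n.lstrip('*') for n in names]  # header names without the marker
    for row in rows:
        record = dict(zip(plain, (c.strip() for c in row.split(','))))
        if all(record[k] == str(wanted[k]) for k in keys):
            return {n: v for n, v in record.items() if n not in keys}
    return None  # no row matches


header = '*year,*month,*day,doy,sunhr,tmax,tmin,rh,wind,rain'
rows = [
    '1990,1,1,1,5.95,31.4,22,87,1.04,0',
    '1990,1,2,2,5.1,32.7,21.9,96,1.02,0',
]
match = find_row(header, rows, year=1990, month=1, day=2)
print(match['tmax'])  # 32.7
```

In this sketch the values stay as strings; converting them to numbers is left out to keep the
example short.
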
Keys do not need to be consecutively arranged or even begin the header line. For instance,
```text
# "weather3.txt"
sunhr,*doy,tmax,tmin,rh,wind,rain,*year
5.95,1,31.4,22,87,1.04,0,1990
5.1,2, 32.7,21.9,96,1.02,0,1990
...
```
where you now have two keys ("doy" and "year"), and these keys are not arranged consecutively
and do not start the header line.
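
The way keys are picked out of a header line, wherever they appear, can be sketched as
follows. This is a simplified, standalone illustration of the rule described above; the
function name `split_headers` is hypothetical.

```python
def split_headers(header_line, marker='*'):
    # Hypothetical helper: separate key headers (marked) from ordinary headers.
    keys, values = [], []
    for name in header_line.split(','):
        name = name.strip()
        if name.startswith(marker):
            keys.append(name[len(marker):])  # drop the marker from the key name
        else:
            values.append(name)
    return keys, values


keys, values = split_headers('sunhr,*doy,tmax,tmin,rh,wind,rain,*year')
print(keys)    # ['doy', 'year']
print(values)  # ['sunhr', 'tmax', 'tmin', 'rh', 'wind', 'rain']
```
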
Lastly, the weather data need not cover just one year. You can store two or more years' worth
of weather data, such as:
```text
# "weather4.txt"
*doy,sunhr,tmax,tmin,rh,wind,rain
1,5.95,31.4,22,87,1.04,0
2,5.1,32.7,21.9,96,1.02,0
...
364,5.5,31.7,20.5,95,0.73,0.4
365,8.05,32,20.6,96,0.58,0
1,2.1,31.4,22.1,100,0.75,1.3
2,4.15,31.9,22.1,91,0.74,50.6
...
1,5,31.1,19.9,91,0.64,3.6
2,3,31.5,20.2,96,0.71,32.7
...
```
Again, as mentioned earlier, if the weather data set has been exhausted, the class will
'rewind' the search back to the start. This is useful when you want to reuse the same
weather data set repeatedly in simulation runs (e.g., running a 20-year simulation
with only one year of weather data).
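
The rewind behaviour can be sketched with a flat list of values and a read position that
wraps around. The example below assumes toy data with two fields and three "days" per year;
the function name `read_year` and the toy values are hypothetical.

```python
def read_year(values, nfields, nsets, year):
    # Hypothetical helper: return `nsets` rows for a 1-based year number,
    # wrapping back to the start when the stored data run out.
    pos = ((year - 1) * nsets * nfields) % len(values)
    rows = []
    for _ in range(nsets):
        rows.append(values[pos:pos + nfields])
        pos += nfields
        if pos >= len(values):
            pos = 0  # data exhausted, so rewind to the top
    return rows


# toy data: fields are (doy, rain), with one stored "year" of 3 days
one_year = [1, 0.0, 2, 1.3, 3, 50.6]
print(read_year(one_year, nfields=2, nsets=3, year=1))
print(read_year(one_year, nfields=2, nsets=3, year=20))  # same rows as year 1
```

With only one stored year, every requested year maps back to the same rows, which is what
makes a long simulation with a short weather record possible.
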

# ATTRIBUTES
    alldata (list): List containing the weather data read from the weather file

# METHODS
    total_years: Number of years for the entire weather data
    total_datasets: Number of data sets for the entire weather data
    load: Read the whole weather data from a provided weather file and store in `alldata`
    update: Load one year of daily data and store them in weather table

    """

    def __init__(self, wthrfile, nsets=365):
        """
        Create and initialize the WeatherFile object.

        # Arguments
            wthrfile (str): weather file name and path
            nsets (int): no. of data sets in a year

        """
        self.alldata = None  # stores the full weather data in a table
        # internal use attributes:
        self.__curdata = None  # holds weather data for current day
        self.__keys = None  # the keys (one or more) used to find weather data
        self.__vals = None  # non-key headers which will be used as weather fields
        self.__pos = None  # the current position in the full weather data storage
        self.__headers = None  # column headers obtained from weather file
        self.load(wthrfile)
        AnnualWeather.__init__(self, nsets, *self.__vals)

    def _readline(self):
        """
        Read the next set of weather data from the table.

        # Returns
            None:

        """
        sz = len(self.__headers)
        endpos = self.__pos + sz  # one data set spans exactly `sz` consecutive values
        self.__curdata = OrderedDict(zip(self.__headers, self.alldata[self.__pos:endpos]))
        self.__pos += sz
        if self.__pos >= len(self.alldata):
            self.__pos = 0  # end of data reached, so rewind to the start

    def _findkeys(self, char='*'):
        """
        Identify which column headers read from the weather file are keys.

        # Arguments
            char (str): symbol or character to indicate key (default '*')

        !!! note
            There can be more than one key, and keys are identified by
            headers marked with a symbol, '*' by default

        # Returns
            None:

        """
        self.__keys = []  # reset/clear
        self.__vals = []
        for n, header in enumerate(self.__headers):
            # `startswith` also copes with an empty header (e.g., from a trailing comma)
            if header.startswith(char):
                self.__headers[n] = header[len(char):]  # rid the key identifier *
                self.__keys.append(self.__headers[n])
            else:
                self.__vals.append(header)  # add to non-key list
        if not self.__keys:
            # no keys found, so assume first column data is the key
            self.__keys.append(self.__headers[0])
            del self.__vals[0]  # first column is taken as key, so delete it from non-keys

    @staticmethod
    def _type(strval):
        """
        !!! note
            `_type` is a static method.

        Determine if a string can be converted into a number (float or integer).

        Weather data in a file may not all be numbers (integers and floats). Dates, for
        instance, can be a string, such as "12/1/1990" for Jan. 12, 1990. This function
        checks and converts an object into an `int` or `float`, if it is possible. Otherwise, it
        leaves the object as it is (no conversion).

        # Arguments
            strval (str): the string to be converted into a number (`int` or `float`)

        # Returns
            float/int/str: type returned after conversion

        """
        try:
            return int(strval)  # whole numbers are kept as integers
        except ValueError:
            pass  # not an integer, so try a float next
        try:
            # also accepts forms without a decimal point, such as "1e3"
            return float(strval)
        except ValueError:
            return strval  # string cannot be converted into a number, so return it as-is

    def total_years(self):
        """
        Return the number of years for the entire weather data.

        # Returns
            int: number of years

        """
        # integer division so that an `int` is returned, as documented
        return len(self.alldata) // len(self.__headers) // self.nsets

    def total_datasets(self):
        """
        Return the number of data sets for the entire weather data.

        # Returns
            int: number of data sets

        """
        # integer division so that an `int` is returned, as documented
        return len(self.alldata) // len(self.__headers)

    def load(self, wthrfile):
        """
        Read the entire contents in a given weather file into memory for fast access.

        Comments in the weather file are ignored and not stored. The weather file must be in
        CSV and plain text format. However, the weather data can also be separated by semicolons.

        # Arguments
            wthrfile (str): the weather file name and path

        # Returns
            None:

        """
        with open(wthrfile, 'rt') as fwthr:
            line = ''
            # ignore all blank and comment lines at the top of the file
            while line == '' or line[0] == '#':
                line = fwthr.readline()
                if line == '':
                    # end of file reached without a header line (avoids an endless loop)
                    raise ValueError('no column headers found in ' + wthrfile)
                line = line.strip()
            # the first non-blank, non-comment line is assumed to hold the column headers,
            # so remove all leading and trailing whitespaces, then tokenize based on
            # comma or semicolon
            self.__headers = [h.strip() for h in re.split(r'[,;]', line)]
            # now identify which headers are the keys
            self._findkeys()  # keys are assumed to be those headers marked with '*'
            self.alldata = []  # clear the weather table for new data
            self.__pos = 0  # current read position in weather table is reset
            # read in the entire weather file into memory (table)
            for data in fwthr:
                if not data.strip():
                    continue  # skip blank lines (e.g., an empty last line)
                lst = [d.strip() for d in re.split(r'[,;]', data)]
                self.alldata += [WeatherFile._type(strval) for strval in lst]

    def update(self, year=0):
        """
        Load one year of daily weather from the weather file.

        Refresh the weather table to a specified year.

        # Arguments
            year (int): weather data set for which year number (>= 1) to load into weather
                        table. If `year` <= 0, the next successive year's weather data set
                        will be used.

        # Returns
            None:

        """
        if year > 0:
            nheaders = len(self.__headers)
            ndata = len(self.alldata)
            # jump to the first value of the requested year, wrapping around if needed
            self.__pos = ((year - 1) * self.nsets * nheaders) % ndata
        for iday in range(self.nsets):
            self._readline()
            for field in self.fields:
                self.table[field][iday] = self.__curdata[field]