Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trade-offs between database column values and insertion of new sensors #127

Open
carlosparadis opened this issue Dec 18, 2019 · 3 comments
Assignees

Comments

@carlosparadis
Copy link
Member

This summarizes a point of concern about our meeting today for new sensors added in the future.

Problem

Currently, in the database schema created by init_database.py, one column is created with enums:

egauge.script.orm_egauge.setup()

(@matthew-schultz I didn't realize you were using egauge orm here. Since we are moving this to a single file it's not too bad but if we didn't make in time it would cause confusion down the road on "why egauge and not the orm of another sensor".)

The database is created as part of a setup function leveraging sql alchemy library:

def setup():
"""
Use defined classes to create tables in the database named in config file
"""
config_path = str(Path(os.path.dirname(os.path.realpath(__file__))).parent.parent) + "/config.txt"
with open(config_path, "r") as file:
#prepend '[DEFAULT]\n' since ConfigParser requires section headers in config files
config_string = '[DEFAULT]\n' + file.read()
config = configparser.ConfigParser()
config.read_string(config_string)
db_url = "postgresql:///" + config['DEFAULT']['db']
db = create_engine(db_url)
BASE.metadata.create_all(db)

And in particular, the point of concern lies within this class of the data model in the same file:

class ScriptFolderEnum(enum.Enum):
"""
This class defines strings that could be inserted into sensor_info.sensor_type
"""
egauge = "egauge"
hobo = "hobo"
webctrl = "webctrl"

  1. According to this, the possible values of sensors are defined at the moment the database is initialized. Adding a new folder sensor, therefore requires not only code be added, but manually editing every database that exists on the server to add the new sensor. There is also no clear way that a new programmer will know they have to edit this column.

  2. There is also a concern if the responsibility of verifying, before adding to the crontab, if the script even exist. Currently, verification does not exist. init_crontab.py passively assumes, since the column sensor_type is enum, that the value will never be inserted incorrectly.

Solution

Not sure yet. The current implementation makes sense because values are inserted directly in the database, and an enum shows on Navicat as a dropdown menu instead of letting Eileen edit the text, which a typo would suffice to mess the crontab file. However, in the long-run due to 1) this solution may prove confusing to someone adding a new sensor or troublesome as they will have to manually edit the database. Code changes alone are insufficient.

We would like a solution similar to what is observed on init_crontab.py on what concerns adding a new folder:

lonoa/init_crontab.py

Lines 54 to 62 in 65609c5

for script_folder in script_folders:
# use ".value" to access value of script_folder enum
script_folder = script_folder.value
# hobo has a different script name
if script_folder == 'hobo':
script_filename = 'extract_hobo.py'
else:
script_filename = 'api_' + script_folder + '.py'
database_active_scripts.append(script_filename)

So long a new script is added following the convention api_ etc, modifying init_crontab.py is even unnecessary altogether!

@matthew-schultz
Copy link
Member

@carlosparadis

Hmm it's difficult to find a more ideal solution for where to keep an authoritative list of sensor_info.script_folder values.

After we combine the orm files into one orm file and moving to to the top level lonoa directory, the main choices I can think of that I could probably finish up soon are:

option 1.) Keeping the script_folder class in the orm file

  • Future updates would require:
    • updating the sensor_info.script_folder enum column in the database
    • updating the ScriptFolderEnum class in the orm file.

option 2.) Moving the ScriptFolderEnum class from orm file to init_crontab

  • update the orm file to import init_crontab.ScriptFolderEnum
  • Future updates would require:
    • updating sensor_info.script_folder enum colum in the database
    • updating the ScriptFolderEnum class in init_crontab

And then updating the wiki with a more clear section on adding a new sensor type section.

I will try to update the wiki with the steps for the current method of adding new sensor types this weekend, unless we decide on a different option

@carlosparadis
Copy link
Member Author

carlosparadis commented Dec 22, 2019

@matthew-schultz Let me know what you think of this:

In a package I use in R, called "pkgdown", there is a convention where any folder or file starting with _ is to be ignored when performing package functions (e.g. generate docs).

So if we were to use that here:

  • egauge
  • _rhasberry_pi
  • webctrl
  • hobo
  • _weather_underground
  • _imonnit

Rhasberry Pi, Imonnit, and Weather Underground would all have a prefix of _ because they don't currently have a /script/api_.py, and hence Lonoa can't add data as it does to egauge-webctrl, hobo by having Eileen add it's sensor name to the script_folder.

The _ convention is also used in programming languages to indicate its internal, or that it should not be expected to be messed with (e.g. C): https://stackoverflow.com/questions/39625352/why-do-some-functions-in-c-have-an-underscore-prefix

I was thinking that if we capitalize on this to rename the folders, we can address the problem we currently have: That of the sensor names being hardcoded on the orm.py, and consequently old projects all have to be manually edited on the database manually to incorporate the new sensor, since init_database.py is only executed once.

The solution would simply have the orm.py enum class create the datatype of enum but with no acceptable value to begin with (i.e. instead of "egauge", "webctrl", "hobo", it wouldn't have anything there).

We would move the responsibility of saying "which values are ok" to init_crontaby.py by making the list of values it can take dynamical: We move the current sensor folders inside a folder called sensors, and adopt the _ notation. The init_crontab.py then, after init_database.py is executed, checks lonoa/sensors for all folders which do not contain the preceding _ and alter the column every time it executes if it detects a new folder in there that doesn't match the available values.

Provided we explicit write on the README a "How to add a sensor" and mention people should not create a folder within sensors without a _, it should be fine. Even if this is disregarded, the only occasion it would break is if Eileen would type the new folder name and the person programming it forgot the _. We could just add a try catch on the api file not existing, and logging on the error table. Since the cronjob runs the sensors independently, this kind of error would not block the other sensor data to keep coming in and being logged appropriately.

I also think the _ notation and this solution would reinforce the convention we already have on the init_crontab.py made by you: Provided new folder and scripts are added to <sensor_folder>/script/api_<sensor_folder>.py, the init_crontab.py does not need to ever be changed. It makes the new hire life easier, and at the same time it keeps the folder hierarchy organized. It's a win-win.

Eileen gets to keep a column where she can see the available sensor names, and old projects can be updated.

@matthew-schultz
Copy link
Member

@carlosparadis I like your solution!

That's pretty elegant how

  1. init_crontab would handle updating the values of the script_folder enum in the database
  2. the script_folder enum's values would be every named folder without a _ at the beginning.

Though we'd just have to see if it could be implemented and tested over the break.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants