Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] OWSql does not save selected table/query #2659

Merged
merged 7 commits into from
Oct 17, 2017
Merged

Conversation

ceprio
Copy link
Contributor

@ceprio ceprio commented Oct 8, 2017

Issue

Added functionality to the widget:

Description of changes
  • The widget now saves the settings of the selected table name and SQL Query
  • Restoring of table name and SQL query is also now working when opening a workflow
  • Connection to the database is done earlier to allow the tables list to be fetched
  • Table name is saved and restored as text (or name of table) instead of Id (as previously)
  • Empty username and password are now saved as '' (empty string) instead of None. This allows the use of SQL Server's Windows authentication (use of the current user credentials if both fields are empty).
  • Placing 'Custom SQL' in second in the table list in order to see it faster when the list of tables is long.
  • Execute SQL query only if minimum number of chars entered
    Note also:
  • A bug was found within SQL Server 2012 (and earlier versions it seem) that had to be dealt with in the backend mssql.py. I added the fix in this pull request so the widget can be tested correctly even on this SQL server version. More details in the source code.
Includes
  • Code changes
  • Tests
  • Documentation

Added functionality to the widget:
- The widget now saves the settings of the selected table name and SQL Query 
- Restoring of table name and SQL query is also now working when opening a workflow
- Connection to the database is done earlier to allow the tables list to be fetched
- Table name is saved and restored as text (or name of table) instead of Id (as previously)
- Empty username and password are now saved as '' (empty string) instead of None. This allows the use of SQL Server's Windows authentication (use of the current user credentials if both fields are empty).
- Placing 'Custom SQL' in second in the table list in order to see it faster when the list of tables is long.
- Execute SQL query only if minimum number of chars entered
@CLAassistant
Copy link

CLAassistant commented Oct 8, 2017

CLA assistant check
All committers have signed the CLA.

Copy link
Member

@astaric astaric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the pull request.

I have left some comments, but they are mostly related to code style/naming, the features you have implemented look ok. Let me know when you address the comments or if you have any questions.

@@ -1,3 +1,5 @@
#!/usr/bin/python
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As widgets are mostly used from Canvas, we usually do not include this. For testing, widget can be run with python widget_module.py.

def data(self, index, role=Qt.DisplayRole):
row = index.row()
if role == Qt.DisplayRole:
return str(self[row])
return super().data(index, role)


class BackendModel(PyListModel):
class BackendModels(PyListModel):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Model is a part of the Qt's Model/View paradigm and should remain singular.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was surprised to see that the Model was is fact a list of backends and you have to select one. Thus the 's'. I'll correct.

@@ -85,10 +87,10 @@ def __init__(self):
vbox = gui.vBox(self.controlArea, "Server", addSpace=True)
box = gui.vBox(vbox)

self.backendmodel = BackendModel(Backend.available_backends())
self.backendmodels = BackendModels(Backend.available_backends())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, model should remain singular. If you prefer plural, I would simply use
self.backends = BackendModel(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, ok I'll do that.

index = self.tablecombo.findText(str(self.table))
if index != -1:
self.tablecombo.setCurrentIndex(index)
self.tablecombo.activated[int].connect(self.select_table) # set up the callback to select_table in case of selection change
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move the commend in a line before the command? We try to limit the lines in source code to maximum of 100 characters (though some go even further and limit it to 79)


def select_table(self):
"Called on tablecombo selection change"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

""" is usually used for docstrings

@@ -347,6 +360,7 @@ def get_table(self):
domain = s.get_domain(inspect_values=True)
self.Information.data_sampled()
else:
#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove

@codecov-io
Copy link

codecov-io commented Oct 11, 2017

Codecov Report

Merging #2659 into master will increase coverage by 0.34%.
The diff coverage is 2.63%.

@@            Coverage Diff            @@
##           master   #2659      +/-   ##
=========================================
+ Coverage   75.46%   75.8%   +0.34%     
=========================================
  Files         331     338       +7     
  Lines       57967   59469    +1502     
=========================================
+ Hits        43743   45082    +1339     
- Misses      14224   14387     +163

@@ -20,7 +21,7 @@
MAX_DL_LIMIT = 1000000


class TableModel(PyListModel):
class TableModels(PyListModel):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one should be singular Model as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have anticipated this one.

@@ -124,17 +125,24 @@ def __init__(self):
box.layout().addWidget(self.passwordtext)

self._load_credentials()
self.tablemodels = TableModels()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tablemodel or tables, but not tablemodels

@@ -347,6 +360,7 @@ def get_table(self):
domain = s.get_domain(inspect_values=True)
self.Information.data_sampled()
else:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this empty line is necessary

cur.execute("SET SHOWPLAN_XML ON")
try:
cur.execute(query)
result = cur.fetchone()
return int(self.EST_ROWS_RE.search(result[0]).group(1))
except AttributeError:
# This is to catch a bug from SQL Server 2012
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, the query plan in SQL Server 2012 produces an XML, which does not match the EST_ROWS_RE regexp as it represents the number of rows as float? If this is the case, why not update the regular expression to match both?

Instead of

StatementEstRows="(\d+)"

which expects an integer between the parentheses, we could use something like this

r'StatementEstRows="(\d+)(\.\d*)?"'

the second regular expressions allow the number to end with a decimal point and some numbers, but only parses the whole part

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exacty, after searching it seem that our SQL server statistics are not maintained often enough and this results in bad values being returned in form of a float thus the AttributeError. I'm not sure if a well maintained server would return a float: I have a second database where statistics are maintained and I never received a float from the Plan query.
It would require more samples to figure out what to do. This code works in all situations, the drawback is the execution time if a float is returned. Maybe there should be a warning issued like those deprecated warnings found in some libraries saying that "statistics are out of date, SELECT count(*) used." but I do not know how to throw such a message in Orange.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified the code comments to explain the reason of the AttributeError. If someone ever comes back with a float with the right estimated result, the code will have to be updated as you suggested and if possible with a sanity check on the statistics (so un-maintained statistics would be caught).

@ceprio
Copy link
Contributor Author

ceprio commented Oct 14, 2017

The code has been updated, please check comments (especially mssql.py).
Thx

except AttributeError:
# This is to catch a float received in StatementEstRows
# a float is received when the server's statistics are out of date.
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have read your comment and I agree, if the float is a result of a reallly bad estimate, it should not be used. I still do not like the flow of this function though. I would put it like this:

match = self.EST_ROWS_RE.search(result[0])
if not match:
    # Comment explaining why this happens
    return None
return int(match.group(1))

raise BackendError(str(ex)) from ex
# In case of an AttributeError, give a second chance:
# Use the long method for counting
cur.execute("SELECT count(*) FROM ( {} ) x".format(query))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be necessary. When count_approx returns None, an exact count is already performed (see SqlTable.count_approx)

@ceprio
Copy link
Contributor Author

ceprio commented Oct 16, 2017

Your proposal makes much sense, I had not seen SqlTable.count_approx. Thank you for that.
Tested with success and your way also catches the non-existance of StatementEstRows in the result if that ever appends.
Thank you,

Eclipse does not seem to be able to take care of these...
@astaric astaric changed the title Update owsql.py [FIX] OWSql does not save selected table/query Oct 17, 2017
@astaric astaric merged commit 37616b9 into biolab:master Oct 17, 2017
@astaric
Copy link
Member

astaric commented Oct 17, 2017

Thank you for your contribution. It will be included in the next release (planned for the last week of October)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants