Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hoist reduction loops #104

Merged
merged 31 commits into from
Nov 30, 2016
Merged

Hoist reduction loops #104

merged 31 commits into from
Nov 30, 2016

Conversation

FabioLuporini
Copy link
Contributor

@FabioLuporini FabioLuporini commented Nov 25, 2016

This PR introduces support for hoisting reduction loops (eg, sum-factorisation in TSFC kernels), alongside a number of additional contributions:

  • Simplification and generalisation of hoisting algorithms, which now work with loop nests of arbitrary depth
  • Generalisation of sharing-graph-based expression rewriting, which now works with n-linear forms (and not just for n <= 2)
  • Generalisation of the CSE's cost model, which now handles loop nests of arbitrary depth (this was necessary given the tensor product structure of basis functions)

Also:

  • Some obsolete code has been dropped.
  • Some ancient code has been refactored.
  • Various components and utility functions have been improved/enhanced/fixed.

fixes #76, fixes #98, closes #99, fixes #100, resolves #101

@@ -466,6 +466,14 @@ def is_const(self):
return not self.rank or all(is_const_dim(r) for r in self.rank)

@property
def is_number(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could have a specific number type if you wish...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a list of changes for COFFEE in mind including a restructuring of base.py, as well as the addition of new types (eg Number) and class attributes (eg is_Symbol, is_Incr). I'll do this in another PR

@@ -1209,6 +1222,26 @@ def gencode(self, not_scope=False):
return self.children[0].gencode()


class Rank(tuple):

def __contains__(self, val):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you happy with the usage that sticks an COFFEE expression into rank?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm "kinda happy", in the sense that this Rank class acts as a work around, although there's probably a cleaner solution

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine, I do not mind having the interface cleanup (which will require corresponding changes in other components) in a separate PR after this one is merged.

self.symbols = symbols
self.flatten = flatten
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is odd, why do you assign a static function onto self?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

work around to circular imports, as I'm importing utils and utils is importing some visitors. Not sure where I should move flatten ? the other option is to re-import flatten in the member methods that use it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure where I should move flatten ?

coffee.base might work I think, although it feels a bit arbitrary. Or you could drop flatten altogether and use chain-star instead:

flatten(xs) => list(chain(*xs)) or list(chain.from_iterable(xs)).

If you directly use this in a for, then you don't even need the list. Of course, you will need from itertools import chain.

the other option is to re-import flatten in the member methods that use it

I think that would express the intention (of breaking circular imports) more clearly than assigning to self.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm now importing it directly in the relevant member, that is indeed better.

I didn't know about chain; will probably use it all over the place in a next PR

@@ -1,6 +1,6 @@
[flake8]
ignore =
E501,F403,F405,E226,E265,E731,E402,E266,
E501,F403,F405,E226,E265,E731,E402,E266,F999,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What fails without F999? On my machine flake8 is still happy without this option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe because you have a different global setup? I get hundreds of warnings without F999

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a global setup? Are you sure your flake8 is up to date?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flake8 --version 3.2.1 (pyflakes: 1.3.0, mccabe: 0.5.2, pycodestyle: 2.2.0) CPython 2.7.12 on Linux

I think with flake8 you can also have a global config file wherein you state all of the options that you want to be ignored in any project -- but maybe that's not your case

# Construct the sharing graph
nodes, edges = [], []
for i in summands(self.stmt.rvalue):
symbols = zip(*explore_operator(i))[0] if not isinstance(i, Symbol) else [i]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you merge with master, you get Python 3 style zip from six.moves, which means this doesn't work. You need to write list(zip(...))[0] instead of zip(...)[0].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, will fix

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I fixed most cases on master. This usage of zip, of course, works in Python 2 if you don't have from six.moves import zip. So I suggest to fix these in the additions of this patch, and I also suggest to add the corresponding six.moves import (filter, map, range, zip -- if used in the file).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

foo(a) --> []
"""

handle = zip(*explore_operator(node))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested fix: cast iterable to list. The iterable is always "true" in a condition, whether it will yield any elements or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@miklos1
Copy link
Contributor

miklos1 commented Nov 28, 2016

When testing against master branches, I get failures in Firedrake tests/regression/test_conditional.py

@miklos1
Copy link
Contributor

miklos1 commented Nov 28, 2016

The conditional problem seems to be fixed, for example, with this patch:

diff --git a/coffee/expander.py b/coffee/expander.py
index 9d41b40..9ece54c 100644
--- a/coffee/expander.py
+++ b/coffee/expander.py
@@ -73,7 +73,7 @@ class Expander(object):
             return ([node], self.EXPAND) if self.should_expand(node) \
                 else ([node], self.GROUP)
 
-        elif isinstance(node, (Div, FunCall)):
+        elif isinstance(node, (Div, Ternary, FunCall)):
             # Try to expand /within/ the children, but then return saying "I'm not
             # expandable any further"
             for n in node.children:
diff --git a/coffee/hoister.py b/coffee/hoister.py
index 52147d4..5da6d9b 100644
--- a/coffee/hoister.py
+++ b/coffee/hoister.py
@@ -83,7 +83,7 @@ class Extractor(object):
 
         elif isinstance(node, (FunCall, Ternary)):
             arg_deps = [self._visit(n) for n in node.children]
-            dep = tuple(set(flatten([dep for dep, _ in arg_deps])))
+            dep = set(flatten([dep for dep, _ in arg_deps]))
             info = self.EXT if all(i == self.EXT for _, i in arg_deps) else self.STOP
             return (dep, info)
 
diff --git a/coffee/rewriter.py b/coffee/rewriter.py
index 289b150..b20e8fa 100644
--- a/coffee/rewriter.py
+++ b/coffee/rewriter.py
@@ -409,7 +409,7 @@ class ExpressionRewriter(object):
             if isinstance(node, (Symbol, Div)):
                 return
 
-            elif isinstance(node, (Sum, Sub, FunCall)):
+            elif isinstance(node, (Sum, Sub, FunCall, Ternary)):
                 for n in node.children:
                     _reassociate(n, node)
 

@miklos1
Copy link
Contributor

miklos1 commented Nov 28, 2016

There is another issue against master branches:

  ...

  A[0][0] = A00[0][0];
  A[0][1] = A01[0][0];
  A[0][2] = A00[0][1];
  A[0][3] = A01[0][1];
  A[1][0] = A10[0][0];
  A[1][1] = A11[0][0];
  A[1][2] = A10[0][1];
  A[1][3] = A11[0][1];
  A[2][0] = A00[1][0];
  A[2][1] = A01[1][0];
  A[2][2] = A00[1][1];
  A[2][3] = A01[1][1];
  A[3][0] = A10[1][0];
  A[3][1] = A11[1][0];
  A[3][2] = A10[1][1];
  A[3][3] = A11[1][1];
  
  for (int  j  = 0; j < 2; j += 1)
  {
    for (int  k  = 0; k < 2; k += 1)
    {
      #pragma coffee expression
      A00[j][k] += (t0[j] * ct5[k]) + (t1[j] * ct6[k]);   
    }
  }
  
  for (int  j  = 0; j < 2; j += 1)
  {
    for (int  k  = 0; k < 2; k += 1)
    {
      #pragma coffee expression
      A11[j][k] += (t0[j] * ct12[k]) + (t1[j] * ct13[k]);
    }
  }

Obviously, the A[j][k] = ... copying should remain at the end of the function. This does not happen with FInAT, since that doesn't need this rearrangement hack for interior facet integrals with vector arguments.

@FabioLuporini
Copy link
Contributor Author

I hadn't tested with firedrake master yet.

The patch looks fine (the tuple(set(... --> set(... is a good catch)

Tell me how to reproduce the other bug, which should be an easy fix

@miklos1
Copy link
Contributor

miklos1 commented Nov 28, 2016

Tell me how to reproduce the other bug, which should be an easy fix

firedrake tests/extrusion/test_interior_facets_extr.py (use master branches)

@FabioLuporini
Copy link
Contributor Author

These should now be fixed

@miklos1
Copy link
Contributor

miklos1 commented Nov 29, 2016

Does not actually fix #97, but I don't think it matters any more. We may still close that as wontfix because of firedrakeproject/tsfc@1d7ce21.

Copy link
Contributor

@miklos1 miklos1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is now OK to go.

@FabioLuporini
Copy link
Contributor Author

I'm also happy to land this. Is that fine for you if I just hit merge, or do you prefer to go in synch with some other PRs?

@miklos1
Copy link
Contributor

miklos1 commented Nov 29, 2016

I'm fine with just merging, I was just wondering if @wence- has anything to say.

@miklos1 miklos1 merged commit 0a2a19a into master Nov 30, 2016
@miklos1 miklos1 deleted the hoist-reduction-loops branch November 30, 2016 11:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants