Replacing multiple children in parent #96

Open
chrisjsewell opened this issue Feb 12, 2019 · 7 comments
Labels
question, todo (Stuff I want to update/fix/improve)

Comments

@chrisjsewell
Contributor

chrisjsewell commented Feb 12, 2019

Heya, could you suggest a best-practice way to do this:

Basically, I want to replace <cite data-cite="cite_key">text</cite> with \cite{cite_key}.

My approach so far is to find the opening tag, then traverse right to find the closing tag:

import re
import panflute as pf

def html_to_latex(element, doc):

    if doc.format not in ("latex", "tex"):
        return None

    if (isinstance(element, pf.RawInline) and
            element.format in ("html", "html5")):
        match = re.match(
            r"<cite\s*data-cite\s*=\"?([^>\"]*)\"?>", element.text)
        if match:
            # look for the closing tag
            closing = element.next
            
            while closing:
                if (isinstance(closing, pf.RawInline) and
                        closing.format in ("html", "html5")):
                    endmatch = re.match(r"^\s*</cite>\s*$", closing.text)
                    if endmatch:
                        break
                closing = closing.next
            
            if closing:
                new_content = pf.RawInline(
                    "\\cite{{{0}}}".format(match.group(1)), format="tex")

                # traverse left and right, to find surrounding content
                init_content = []
                prev = element.prev
                while prev:
                    init_content.insert(0, prev)
                    prev = prev.prev
                final_content = []
                final = closing.next
                while final:
                    final_content.append(final)
                    final = final.next
                
                final_block = (
                    init_content + [new_content] + final_content)

                element.parent.content = final_block

However, this does not replace the parent content in the final doc!?

@sergiocorreia
Owner

You might be missing a return statement. If the function returns None (the default if you don't specify a return), the walk() function returns the original object.

Also, it's not best practice to modify the parent from the child function, and I'm not really sure if it's needed. Can you just create a new element directly and return that? (or I might be missing something)
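
A toy model may help explain why the assignment to element.parent.content disappears. The sketch below is not panflute's code: Node, walk, drop_via_parent, drop_via_return and names are hypothetical stand-ins, written under the assumption that walk() visits children first and then rebuilds each container's content from whatever the action returned. Under that assumption, an edit made to the parent from inside a child's action is overwritten once the parent finishes collecting its children, whereas returning [] from the action removes the element.

```python
class Node:
    """Hypothetical stand-in for a document element with child content."""

    def __init__(self, name, children=None):
        self.name = name
        self.parent = None
        self.content = list(children or [])
        for child in self.content:
            child.parent = self

    def walk(self, action):
        # Assumed behaviour: children are processed first, and the container's
        # content is then rebuilt from the values the action returned.
        results = []
        for child in list(self.content):
            out = child.walk(action)
            results.extend(out if isinstance(out, list) else [out])
        # This assignment discards any edit a child made to self.content.
        self.content = results
        for child in results:
            child.parent = self
        altered = action(self)
        return self if altered is None else altered


def drop_via_parent(node):
    """Try to delete node 'b' by editing its parent (lost in this model)."""
    if node.name == "b":
        node.parent.content = [c for c in node.parent.content if c is not node]


def drop_via_return(node):
    """Delete node 'b' by returning an empty list."""
    if node.name == "b":
        return []


def names(node):
    return [child.name for child in node.content]


first = Node("para", [Node("a"), Node("b"), Node("c")])
first.walk(drop_via_parent)
print(names(first))   # 'b' is still present

second = Node("para", [Node("a"), Node("b"), Node("c")])
second.walk(drop_via_return)
print(names(second))  # 'b' has been removed
```

If the real walk() behaves like this model, the practical consequence is that changes to siblings have to be made either by an action that runs on the container itself or after the walk has finished.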

@chrisjsewell
Contributor Author

Thanks for the quick reply. Maybe I'm not understanding the process correctly.
My understanding was that, if you return an element, it replaces the element supplied to the action,
with the returned element?

If this is the case, then surely you can't just return a modified parent to change it?
How do I, in essence "walk back" to the parent and return the modified one?

My solution/hack at the moment, is to record any elements that need to be deleted, then
delete them at the end:

def action(element, doc):
    ...
    doc.to_delete.setdefault(element.parent, set()).update(delete_content)

def prepare(doc):
    # type: (Doc) -> None
    doc.to_delete = {}

def finalize(doc):
    # type: (Doc) -> None
    for element, delete in doc.to_delete.items():
        element.content = [e for e in element.content if e not in delete]
    del doc.to_delete

@chrisjsewell
Contributor Author

> Can you just create a new element directly and return that?

I don't want to create, I want to destroy! (lol), i.e. I want to remove elements neighbouring the current element, whilst potentially keeping that element the same

@sergiocorreia
Owner

> I don't want to create, I want to destroy! (lol), i.e. I want to remove elements neighbouring the current element, whilst potentially keeping that element the same

As far as I recall, to destroy an element you write return []. However, if you want to remove neighbors, maybe the best approach is to have action() act on the parent and then walk over its children, deleting elements. I'm a bit short of time this week and next; otherwise I would have come up with a proof of concept of what I mean.
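
The idea of acting on the parent can be sketched as a plain function over the parent's list of inline children. This is only an illustration, not the proof of concept mentioned above: collapse_cites, make_cite, and the Raw and Str stand-ins are hypothetical names, and the function relies on nothing but each item's text and format attributes so that it can be tried without pandoc.

```python
import re
from collections import namedtuple

OPEN_CITE = re.compile(r'<cite\s+data-cite\s*=\s*"?([^>"]*)"?\s*>')
CLOSE_CITE = re.compile(r'^\s*</cite>\s*$')


def _is_raw_html(item):
    """True for items that look like raw HTML inlines (duck-typed)."""
    return (getattr(item, "format", None) in ("html", "html5")
            and hasattr(item, "text"))


def collapse_cites(items, make_cite):
    """Return a new list in which every <cite ...> ... </cite> run is
    replaced by make_cite(key). Unclosed opening tags are left untouched."""
    out = []
    i = 0
    while i < len(items):
        item = items[i]
        match = OPEN_CITE.match(item.text) if _is_raw_html(item) else None
        if match:
            # Look to the right for the matching closing tag.
            close = next(
                (j for j in range(i + 1, len(items))
                 if _is_raw_html(items[j]) and CLOSE_CITE.match(items[j].text)),
                None)
            if close is not None:
                out.append(make_cite(match.group(1)))
                i = close + 1  # skip the opening tag, the body and the closing tag
                continue
        out.append(item)
        i += 1
    return out


# Hypothetical stand-ins for raw and plain-text inline elements.
Raw = namedtuple("Raw", "text format")
Str = namedtuple("Str", "text")

inlines = [Str("see"), Raw('<cite data-cite="cite_key">', "html"),
           Str("text"), Raw("</cite>", "html"), Str("end")]
result = collapse_cites(inlines, lambda key: Raw("\\cite{%s}" % key, "tex"))
print([item.text for item in result])
```

In a filter, this might be called from an action that matches container elements, along the lines of element.content = collapse_cites(list(element.content), lambda key: pf.RawInline("\\cite{%s}" % key, format="tex")). Which container types the action should match is left open here.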

@chrisjsewell
Contributor Author

Yeah, the only pain with that is that you have to decide ahead of time what the possible parent types of the element are.

Btw, if you're interested, I've almost finished what I was trying to achieve in this module of my ipypublish package, which is effectively an enhanced port of pandoc-xnos to panflute :)

@chrisjsewell
Contributor Author

chrisjsewell commented Feb 16, 2019

I wrote a small convenience function, to list all elements that might contain a particular element:

import inspect
import panflute as pf

def find_allowed(targets, allow_meta=False):
    """
    >>>  find_allowed([pf.Para()])
    [panflute.elements.BlockQuote,
    panflute.elements.Definition,
    panflute.elements.Div,
    panflute.elements.Doc,
    panflute.elements.ListItem,
    panflute.elements.Note,
    panflute.elements.TableCell]
  
    """
    allowed = []
    all_elements = inspect.getmembers(
        pf.elements,
        predicate=(lambda e: inspect.isclass(e) and issubclass(e, pf.Element)))
    base_elements = inspect.getmembers(
        pf.base,
        predicate=(lambda e: inspect.isclass(e) and issubclass(e, pf.Element)))
    for name, el in all_elements:
        if (name, el) in base_elements:
            continue
        try:
            inst = el(*targets)
        except TypeError:
            continue
        if not hasattr(inst, 'content'):
            continue
        if el.__name__.startswith("Meta") and not allow_meta:
            continue
        allowed.append(el)
    return allowed

@chrisjsewell
Contributor Author

A final (maybe!) note on this, I ended up using the helper function below.
This accounts for the special cases, such as where an Inline is within a Table caption.

FYI, my whole filter is explained here: https://ipypublish.readthedocs.io/en/latest/markdown_cells.html

import panflute as pf

def get_pf_content_attr(container, target):

    panflute_inline_containers = [
        pf.Cite,
        pf.Emph,
        pf.Header,
        pf.Image,
        pf.LineItem,
        pf.Link,
        pf.Para,
        pf.Plain,
        pf.Quoted,
        pf.SmallCaps,
        pf.Span,
        pf.Strikeout,
        pf.Strong,
        pf.Subscript,
        pf.Superscript,
        pf.Table,
        pf.DefinitionItem
    ]

    panflute_block_containers = (
        pf.BlockQuote,
        pf.Definition,
        pf.Div,
        pf.Doc,
        pf.ListItem,
        pf.Note,
        pf.TableCell
    )

    if issubclass(target, pf.Cite):
        # we assume a Cite can't contain another Cite
        if not isinstance(container, tuple(panflute_inline_containers[1:])):
            return False

    if issubclass(target, pf.Inline):
        if isinstance(container, tuple(panflute_inline_containers)):
            if isinstance(container, pf.Table):
                return "caption"
            elif isinstance(container, pf.DefinitionItem):
                return "term"
            else:
                return "content"
        else:
            return False

    if issubclass(target, pf.Block):
        if isinstance(container, tuple(panflute_block_containers)):
            return "content"
        else:
            return False

    raise TypeError("target not Inline or Block: {}".format(target))

def action(element, doc):
    content_attr = get_pf_content_attr(element, pf.RawInline)
    if not content_attr:
        return None
    content = getattr(element, content_attr)  
    ...
    setattr(element, content_attr, content)
    return element

sergiocorreia added the question and todo (Stuff I want to update/fix/improve) labels on Dec 2, 2019