How to add a custom token inside a block token? #194

Open
MilyMilo opened this issue Aug 22, 2023 · 3 comments
@MilyMilo

MilyMilo commented Aug 22, 2023

Hi! Sorry if this is a silly question - I've just started using mistletoe today! (I really like the interface, thank you for your work!)

I'd like to implement an [embed]: (file.txt) token, which will read the file and include it inline.

I have code like this:

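import re

from mistletoe.html_renderer import HtmlRenderer
from mistletoe.span_token import SpanToken


class Embed(SpanToken):
    # Only the captured path is needed, so the match is not parsed for inner tokens.
    parse_inner = False
    pattern = re.compile(r"\[embed\]\s*:?\s*#?\s*\((.*)\)")

    def __init__(self, match):
        self.include_path = match.group(1)


class WriteupRenderer(HtmlRenderer):
    def __init__(self, **kwargs):
        # Passing Embed to the renderer registers the custom span token.
        super().__init__(Embed, **kwargs)

    def render_embed(self, token: Embed):
        # will read and output the file contents
        return "RENDERING EMBED"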

The only issue is that most of my embeds live inside code blocks or code fences like:

`code with [embed]: (./shell64.s)`

This produces the following markup, which I understand is expected, but I'd like to change this behaviour:

<p><code>code with [embed]: (./shell64.s)</code></p>

(My embed tokens work fine outside of code blocks and fences)

From what I could see, the CodeBlock does not parse inner tokens. How should I go about implementing this? Is it possible without re-creating too many tokens / blocks?

@anderskaplan
Contributor

Hi @MilyMilo, it is certainly possible to make custom tokens for fenced code blocks and for code spans with the behavior that you want, but I believe it's going to be a lot of work.

The embedding feature you want would probably be easier to build as a pre-processing step, working on the raw text input. That would also make sense from a conceptual point of view, imho.

Maybe you could use a customized parser which only recognizes paragraphs and links in that first step 😄
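
For reference, the pre-processing idea could be sketched roughly like this. It is only a sketch and not part of mistletoe: `expand_embeds`, `base_dir`, and `read_file` are hypothetical names, and the pattern is copied from the `Embed` token in the question. The raw text is rewritten before any Markdown parsing happens, so markers inside code spans and fences are expanded as well.

```python
import re
from pathlib import Path

# Same pattern as the Embed token in the question.
EMBED_PATTERN = re.compile(r"\[embed\]\s*:?\s*#?\s*\((.*)\)")


def expand_embeds(markdown_text, base_dir=".", read_file=None):
    """Replace each embed marker in the raw Markdown with the file's contents.

    read_file is a hypothetical hook (path -> str); by default the path is
    resolved against base_dir and read from disk.
    """
    if read_file is None:
        def read_file(path):
            return (Path(base_dir) / path).read_text()

    # A function replacement avoids backslash handling in the inserted text.
    return EMBED_PATTERN.sub(
        lambda match: read_file(match.group(1).strip()), markdown_text
    )


# Example with a stubbed reader, so no file is needed:
expanded = expand_embeds(
    "`code with [embed]: (./shell64.s)`", read_file=lambda path: "nop"
)
```

The expanded text would then be handed to mistletoe as usual. Two caveats: the `(.*)` group is greedy, so a line with several markers is matched as a single one, and file contents inserted outside a code span are themselves parsed as Markdown.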

@pbodnar
Collaborator

pbodnar commented Aug 27, 2023

Hi @MilyMilo, as @anderskaplan writes, it would probably make sense to do the embedding as a pre-processing step - i.e. before calling mistletoe?

Alternatively, I think you could try overriding the corresponding HtmlRenderer render methods, like this (schematically):

    def render_inline_code(self, token: span_token.InlineCode) -> str:
        # TODO: embed chunks inside token.children[0].content
        return super().render_inline_code(token)

    # ...

    def render_block_code(self, token: block_token.BlockCode) -> str:
        # TODO: embed chunks inside token.children[0].content (the same as above => call a shared helper method)
        return super().render_block_code(token)

What do you think?

@MilyMilo
Author

Thank you @pbodnar and @anderskaplan for the pointers!

As both of you suggested, this is currently implemented as a preprocessing step. We just have a regex to do that, but it doesn't feel very clean or robust. That's why I wanted to use mistletoe to improve this.

I'll check if there's a nice way to get it working and post the code here if I end up implementing this.
