Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENG-4010]Codeblock cleanup in markdown #4233

Merged
merged 21 commits into from
Nov 8, 2024
Merged

Conversation

ElijahAhianyo
Copy link
Collaborator

@ElijahAhianyo ElijahAhianyo commented Oct 24, 2024

Decouple the code block component (React Syntax Highlighter) from the markdown component. Also move the Markdown component map function var creation logic to the component level.
With this change, the react syntax highlighter logic/imports don't get compiled when a different code block is used.

Comment on lines 347 to 361
def _get_map_fn_var_from_children(self, component: Component, tag: str) -> Var:
"""Create a function Var for the component map for the specified tag.

Args:
component: The component to check for custom code.
tag: The tag of the component.

Returns:
The formatted component map.
The function Var for the component map.
"""
components = {
tag: Var(
_js_expr=f"(({{node, {_CHILDREN._js_expr}, {_PROPS._js_expr}}}) => ({self.format_component(tag)}))"
)
for tag in self.component_map
}
if isinstance(component, MarkdownComponentMap):
return component.create_map_fn_var(f"({self.format_component(tag)})")

# Separate out inline code and code blocks.
components["code"] = Var(
_js_expr=f"""(({{node, inline, className, {_CHILDREN._js_expr}, {_PROPS._js_expr}}}) => {{
const match = (className || '').match(/language-(?<lang>.*)/);
const language = match ? match[1] : '';
if (language) {{
(async () => {{
try {{
const module = await import(`react-syntax-highlighter/dist/cjs/languages/prism/${{language}}`);
SyntaxHighlighter.registerLanguage(language, module.default);
}} catch (error) {{
console.error(`Error importing language module for ${{language}}:`, error);
}}
}})();
}}
return inline ? (
{self.format_component("code")}
) : (
{self.format_component("codeblock", language=Var(_js_expr="language", _var_type=str))}
);
}})""".replace("\n", " ")
)
# fallback to the default fn Var creation if the component is not a MarkdownComponentMap.
return MarkdownComponentMap.create_map_fn_var(f"({self.format_component(tag)})")
Copy link
Collaborator Author

@ElijahAhianyo ElijahAhianyo Oct 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior here is to get the map_fn_var of the component. However just like the existing behaviour was (before this PR), it is--and was--assumed that the targetted component is not wrapped in a parent component or the component is not memorized (using rx.memo). This assumption is a bit problematic.
Eg.

component_map = {
"p": lambda value: Text.create(value, margin_y="1em"),
}

If the component in the lambda fn (Text) mixes MarkdownComponentMap, it is straightforward to obtain the map_fn_var. However, if the component returns the component in a wrapper (eg. Text.create returning Text wrapped in a Box), then its not obvious how to extract the var. Another eg is the case of returning a memorized fn that returns a component instead of a direct component:

@rx.memo
def code_block(code: str, language: str):
    return rx.box(
        rx.code_block(
            code,
            language=language,
            class_name="code-block",
            can_copy=True,
        ),
        class_name="relative mb-4",
    )

def code_block_markdown(*children, **props):
    language = props.get("language", "plain")
    return code_block(code=children[0], language=language)

...
rx.markdown("# header", component_map={"codeblock": code_block_markdown}),

(codeblock is treated differently because of the inline logic we have so this should be fine but think of an alternative eg)

A fix for this was to recursively traverse the component tree and find the child with the appropriate var, but this introduces another problem where if a parent component is of the MarkdownComponentMap type, we might be getting the wrong var. In this case, what do we do? I think this is one to think about or consider. I initially had this implementation as a recursive one but I had to revert this to follow the exact behavior as was previously

@ElijahAhianyo ElijahAhianyo changed the title [WIP]Codeblock cleanup in markdown [ENG-4010]Codeblock cleanup in markdown Oct 31, 2024
Copy link

linear bot commented Oct 31, 2024

Copy link
Collaborator

@masenf masenf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i like the approach of making it customizable per component and taking the special case out of the template rendering logic.

as for limiting the recursive searching, i think that's also okay. as it stands now, there's just no way to customize the render function var for custom components; if a user needs a custom composite component, they could just mix Box and MarkdownComponentMap and do what they need to do.

@ElijahAhianyo ElijahAhianyo force-pushed the elijah/markdown-cleanup branch 2 times, most recently from a278535 to e06ed3f Compare November 1, 2024 11:30
@ElijahAhianyo ElijahAhianyo marked this pull request as ready for review November 1, 2024 12:59
Copy link
Member

@adhami3310 adhami3310 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good in general! I feel there's something in general of

when language is given to code that is not a literal string var then add the dynamic import, otherwise use static imports

in which case, markdown should be a lot simpler

reflex/components/markdown/markdown.py Outdated Show resolved Hide resolved
@ElijahAhianyo ElijahAhianyo force-pushed the elijah/markdown-cleanup branch 2 times, most recently from f93ef4f to 8f4f661 Compare November 6, 2024 18:01
@adhami3310
Copy link
Member

It seems using markdown with this always imports python syntax highlighting even when no python block is there, why is that?

@adhami3310
Copy link
Member

This fails:

component_map={
    "codeblock": lambda value, **props: rx.text(
        "Props: ", props, ". Value: ", value
    ),
},

@adhami3310
Copy link
Member

adhami3310 commented Nov 6, 2024

This:

rx.code_block(
    """use reflex::events::Events;""",
    language="rust",
)

Returns something like:

export function Prismasynclight_8ebdbf53b7f95eef99249393f8836500() {
  const { resolvedColorMode } = useContext(ColorModeContext);

  const _language = "rust";

  if (_language) {
    (async () => {
      try {
        const module = await import(
          `react-syntax-highlighter/dist/cjs/languages/prism/${_language}`
        );
        SyntaxHighlighter.registerLanguage(_language, module.default);
      } catch (error) {
        console.error(
          `Error importing language module for ${_language}:`,
          error
        );
      }
    })();
  }

  return (
    <SyntaxHighlighter
      children={"use reflex::events::Events;"}
      language={_language}
      style={resolvedColorMode === "light" ? oneLight : oneDark}
    />
  );
}

Why can't this be a static import?

@ElijahAhianyo
Copy link
Collaborator Author

This fails:

component_map={
    "codeblock": lambda value, **props: rx.text(
        "Props: ", props, ". Value: ", value
    ),
},

this should be fixed now

@ElijahAhianyo
Copy link
Collaborator Author

It seems using markdown with this always imports python syntax highlighting even when no python block is there, why is that?

I think its coming from the _get_custom_code in markdown

def _get_custom_code(self) -> str | None:
        hooks = {}
        for _component in self.component_map.values():
            comp = _component(_MOCK_ARG)
            hooks.update(comp._get_all_hooks_internal())
            hooks.update(comp._get_all_hooks())

Moving the syntaxhighlighter language registration logic into the CodeBlock as a hook means when extracting all hooks from a component, it includes the code block hook with the default set language (python).
I think a way to get around that was a condition to exclude hooks from the CodeBlock component and one way to do that was to modify the _get_all_hooks API to accept a list of component types to ignore when getting hooks of its children, but we cant do that without a breaking change. Alternatively, I thought we could implement a modified version of the _get_all_hooks here (i.e, going down the tree to exclude the CodeBlock component here without calling comp._get_all_hooks)

@ElijahAhianyo
Copy link
Collaborator Author

ElijahAhianyo commented Nov 7, 2024

This:

rx.code_block(
    """use reflex::events::Events;""",
    language="rust",
)

Returns something like:

export function Prismasynclight_8ebdbf53b7f95eef99249393f8836500() {
  const { resolvedColorMode } = useContext(ColorModeContext);

  const _language = "rust";

  if (_language) {
    (async () => {
      try {
        const module = await import(
          `react-syntax-highlighter/dist/cjs/languages/prism/${_language}`
        );
        SyntaxHighlighter.registerLanguage(_language, module.default);
      } catch (error) {
        console.error(
          `Error importing language module for ${_language}:`,
          error
        );
      }
    })();
  }

  return (
    <SyntaxHighlighter
      children={"use reflex::events::Events;"}
      language={_language}
      style={resolvedColorMode === "light" ? oneLight : oneDark}
    />
  );
}

Why can't this be a static import?

I think the idea was to reduce the number of special cases when importing and registering languages for CodeBlock especially when using the CodeBlock in a memoized component or in the component map of the markdown. By keeping a unified logic, it becomes easier to understand how imports work in all scenarios. From a performance perspective, my understanding is that loading and registering languages lazily should not have a negative impact. When there are multiple code blocks using the same language on a page, the browser only loads the file once and subsequently retrieves it from the cache

@masenf masenf merged commit cd59ab5 into main Nov 8, 2024
34 checks passed
@masenf masenf deleted the elijah/markdown-cleanup branch November 8, 2024 03:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants