Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code Highlighting Example double characters #4445

Closed
samarsault opened this issue Aug 12, 2021 · 26 comments · Fixed by #4460 or #4556
Closed

Code Highlighting Example double characters #4445

samarsault opened this issue Aug 12, 2021 · 26 comments · Fixed by #4460 or #4556
Labels

Comments

@samarsault
Copy link
Contributor

samarsault commented Aug 12, 2021

Description
When typing around highlighted words, characters are inserted twice.

Recording

ezgif com-gif-maker (1)

Sandbox
https://www.slatejs.org/examples/code-highlighting

Expectation
Character should be inserted only once

Environment

  • Operating System: macOS
  • Browser: Chrome
@samarsault samarsault added the bug label Aug 12, 2021
@samarsault samarsault changed the title Code Highlighting example double characters Code Highlighting Example double characters Aug 12, 2021
@kalinkochnev
Copy link

kalinkochnev commented Aug 15, 2021

I was having this issue (alongside some other ones) so I tried making my own improvements. I just made this a couple of days ago, so it's not very high quality, but it gets the job done. Here is how it looks with my own theme:

codehighlight

If anyone has any questions, please ask! I stripped down some of my editor specific implementation, but it should work for the most part if anyone else wants to try.

Issue
I'm not entirely sure why the highlighting leaks onto the other text, but my implementation doesn't have that issue. What I noticed with the slate example for code highlighting is that only does "surface level" highlighting, in the sense that Prism tokenizes <h1> as Punctuation --Tag -- Punctuation, but the example looks at the top level and only sees Tag, so it highlights it all purple.

My Fixes

  • Can use any prism compatible stylesheet
  • Highlights all prism tokens even if they're nested by recursing through the Prism.tokenize results
  • Also seems to fix the double text when typing near highlighted text
import Prism from "prismjs";
import "prismjs/themes/prism-coy.css";
import components from "prismjs/components"
import { Text } from 'slate';

const push_string = (token, path, start, ranges, token_type="text") => {
    ranges.push({
        "prism_token": token_type,
        anchor: { path, offset: start },
        focus: { path, offset: start + token.length },
    })
    start += token.length
    return start;
}

// This recurses through the Prism.tokenizes result and creates stylized ranges based on the token type
const recurseTokenize = (token, path, ranges, start, parent_tag) => {
    // Uses the parent's token type if a Token only has a string as its content
    if (typeof token === 'string') { 
        return push_string(token, path, start, ranges, parent_tag)
    }
    if ('content' in token) {
        // Calls recurseTokenize on nested Tokens in content
        for (const subToken of token.content) {
            start = recurseTokenize(subToken, path, ranges, start, token.type);
        }
        return start
    }
}

const decorateCodeFunc = (editor, [node, path]) => {
    const ranges = []
    if (!Text.isText(node)) {
        return ranges
    }
    
    // You can use this to specify a language like in the gif if you add a bit of logic
    let language = "html";
    let lang_aliases = { "html": "markup" }
    if (language in lang_aliases) {
        language = lang_aliases[language]
    }
    if (!(language in components.languages)) {
         return ranges
    }

    // If you wanna import dynamically use this line, but beware the massive import (211 KB!!)
    // require(`prismjs/components/prism-${language}`)
    const tokens = Prism.tokenize(node.text, Prism.languages[language])
    
    let start = 0
    for (const token of tokens) {
        start = recurseTokenize(token, path, ranges, start);
    }
    return ranges
}

// the different token types can be found on Prismjs website
const CodeLeaf = ({ attributes, children, leaf }) => {
    return (
        <span {...attributes} className={`token ${leaf.prism_token}`}>
            {children}
        </span>
    )
}

const CodeBlock = (props) => {
    return (
        <pre className="code-block" {...props.attributes}>
            <code>{props.children}</code>
        </pre>
    );
}

export {CodeBlock, CodeLeaf, decorateCodeFunc}

@echarles
Copy link
Contributor

@kalinkochnev I am new to slate and looking in using it as code editor, so your fix is interesting. I will need a bit more completed example to understand your fix, like a codesandbox or a zip with the complete project. Is it possible for you to provide such a complete example?

@kalinkochnev
Copy link

kalinkochnev commented Aug 16, 2021

@echarles Yeah I'll work on putting this up somehow as soon as I can, though in it's current state it slows down significantly with code blocks more than 40 lines long. This is an issue with both my implementation and the example syntax highlighting, but I'm working on a fix which might take a while.

I'll upload what I have in its simplest form, but I'll also include something with my performance fixes eventually.

@kalinkochnev
Copy link

kalinkochnev commented Aug 16, 2021

@echarles Here is a version that works. It currently only does html because the imports for prismjs language grammars wasn't working in the sandbox. The whole performance issue I'm talking about isn't too noticeable when highlighting html, but with something more complicated like js, it starts to get bogged down.

https://codesandbox.io/s/recursive-token-highlighting-tryc6

@echarles
Copy link
Contributor

Thx a bunch @kalinkochnev I have that working locally which is great. Indeed performance needs to be worked on. Pasting the complete HTML source of this github issue page still hangs after 1 minute. Pasting just the 50 first lines works fine.

Maybe (not sure) inspiration can be taken from codemirror/dev#109 (see improvements from https://github.com/satya164/react-simple-code-editor vs https://github.com/datavis-tech/codearea). I guess incremental parsing should be implemented with https://github.com/tree-sitter/tree-sitter or https://github.com/lezer-parser

Screenshot 2021-08-17 at 09 48 30

@echarles
Copy link
Contributor

Linking here to other discussions related to peformance

@jaked
Copy link
Contributor

jaked commented Aug 18, 2021

the doubled character bug is recent—works OK in slate-react 0.65.3 (see https://codesandbox.io/s/code-editor-bug-reproduction-0-65-3-j673k), fails in the next snapshot version 0.66.0-20217122211 (see https://codesandbox.io/s/code-editor-bug-reproduction-0-66-0-20217122211-dmdsl).

@jaked
Copy link
Contributor

jaked commented Aug 18, 2021

I bisected from the 0.65.3 tag to HEAD and it looks like the offending commit is 25afbd4, #3888.

@dylans
Copy link
Collaborator

dylans commented Aug 18, 2021

That's a shame as #3888 is pretty promising in what it strives to solve. It would be great if someone can look at it closer. We've had a lot of PRs lately around IME.

@echarles
Copy link
Contributor

echarles commented Aug 18, 2021

We have elaborated in the previous comments to more thoughts for highlighting, and now it looks like we have a clue to fix the root cause of the regression. I can spend a bit of time to try to fix it (this will allow me also to discover slate code base). Any pointer where to look at (block code that could be responsible, idea to hack/debug the behaviour) will help me in that process. Thx!

@dylans
Copy link
Collaborator

dylans commented Aug 18, 2021

The commit that broke things was a fairly big change to try to rely on native browser events more as part of React's shift towards using browser native events ( some details in https://reactjs.org/blog/2020/08/10/react-v17-rc.html ). The files changed in https://github.com/ianstormtaylor/slate/pull/3888/files would be where to focus (It's about 200 lines), but my off the cuff guess is that somewhere we're having two events fire for the insert.

@jaked
Copy link
Contributor

jaked commented Aug 18, 2021

I spent some time on it yesterday; I don't think it has to do with event handling, but rather with the rendering changes in Leaf / String. What seems to be happening is that at the boundary of a decoration, the DOM has something like

<span>&gt;</span><span>Hi!</span>

(where the decoration for &gt; is different from the decoration for Hi!)

If you select the point between &gt; and H, then type x, the browser puts the x in the first span, the text node changes, decorations are recomputed, and according to the decorations the x goes in the second span. But when the components are rendered, only the second span is redrawn, so the x remains in the first span, and we see a doubled character. This seems to implicate the Leaf / String changes.

I'll keep digging and see if I can understand exactly what's happening.

@echarles
Copy link
Contributor

Visual representation of @jaked 's explanation

Screenshot 2021-08-18 at 19 46 11

@jaked
Copy link
Contributor

jaked commented Aug 18, 2021

The issue seems to be a new React.memo call on TextString:

const TextString = React.memo(
  (props: { text: string; isTrailing?: boolean }) => {

In the example above, what happens here is that for the first span, text hasn't changed after the edit (it's still >), so we don't re-render the component. The DOM element has changed (it now contains >x), but we don't get a chance to run the new logic that compares the DOM element against text:

if (ref.current && ref.current.textContent !== text) {
      forceUpdateFlag.current = !forceUpdateFlag.current
}

If I remove the React.memo the bug goes away.

@bkrausz I think this might have been a mistake introduced in a300183 (since the React.memo doesn't have quite the same effect as the shouldComponentUpdate it replaces). I think it's safe just to remove the React.memo, what do you think?

@echarles
Copy link
Contributor

If I remove the React.memo the bug goes away.

Same here.

(since the React.memo doesn't have quite the same effect as the shouldComponentUpdate it replaces). I think it's safe just to remove the React.memo, what do you think?

No idea here...

trying out quickly a few other examples does not seem to give an issue.

Running yarn test works fine also without using React.memo.

BTW Are there any benchmark to measure any regression in terms of performance?

@dylans
Copy link
Collaborator

dylans commented Aug 18, 2021

BTW Are there any benchmark to measure any regression in terms of performance?

Not yet, but we're finally working to add tests for slate-react and I would like to see what we can do to add performance regression tests as well. I think regression performances are a big risk currently with landing PRs in slate-react.

@bkrausz
Copy link
Contributor

bkrausz commented Aug 19, 2021

@bkrausz I think this might have been a mistake introduced in a300183 (since the React.memo doesn't have quite the same effect as the shouldComponentUpdate it replaces). I think it's safe just to remove the React.memo, what do you think?

Yea, I think that's worth trying. The comment indicates that this was originally a fix for typing a letter and then undoing, since React would then fail to re-render. Can someone check that?

I will express some general FUD over the dual-path nature of my PR: I suspect we'll run into subtle bugs that have to do with the mixing of native events and handled events. Not sure if there's a longer term solution (I'm not aware of what other editors do).

@echarles
Copy link
Contributor

Not yet, but we're finally working to add tests for slate-react and I would like to see what we can do to add performance regression tests as well. I think regression performances are a big risk currently with landing PRs in slate-react.

Yes, would love seeing that but that is not a small piece of work. I have worked on benchmarks for jupyterlab (https://github.com/jupyterlab/benchmarks) and a current initiative is to run the comparison benchmark on each PR.

For the React.memo remove on String, I have displayed 1000 editable-voids examples on a single page. For both with and without React.memo` I get 12 seconds to see the widgets without CSS look, and 18 seconds to see them completely rendered with CSS. That is not scientific benchmark, but just a user perception if we change fundamentally something with that little change. It looks like it does not change fundamentally.

Yea, I think that's worth trying. The comment indicates that this was originally a fix for typing a letter and then undoing, since React would then fail to re-render. Can someone check that?

I have tried to undo with the fix (no React.memo and it works fine.

@dylans
Copy link
Collaborator

dylans commented Aug 19, 2021

Raised a PR, #4460.

@mterezac
Copy link

mterezac commented Sep 22, 2021

I'm still having this issue in the newest version.

Details
I've created a simplified version of the "code highlighting" example, and I ran the code in two different contexts: my local slatejs repo and a create-react-app. The bug only happens on the create-react-app (and in a code sandbox https://codesandbox.io/s/slate-highlight-simplified-o1pt9).

Recording
slatejs-codehighlight

@jaked
Copy link
Contributor

jaked commented Sep 22, 2021

Reproduces for me in CodeSandbox.

I'm not able to reproduce it in my local repo (on main), and it doesn't affect my app when I upgrade to the latest published Slate. I also tried your example as a static page with no build step so I could be sure what versions of things are imported (see https://glitch.com/edit/#!/mulberry-amenable-ink) but it doesn't repro there either. Weird.

@mterezac
Copy link

mterezac commented Sep 23, 2021

Yes, it's very weird. I'm still not able to reproduce the bug on my slatejs local repo, even when I changed the code example to match that of my create-react-app. Anyways, after debugging and comparing both, I found the following differences:

context: The effect of typing h to the editor with value: <h1>

  1. In the broken version, the TextString component is called twice, so the span element has key B, which I guess prevents the update from happening. Meanwhile, in the working version, the span key changes to A
  2. In the broken version, the onChange and onContextChange (from with-react, slate component and my component) do not seem to be batched.

runWithPriority$1 (webpack://slate-editor/node_modules/react-dom/cjs/react-dom.development.js#11039)
flushSyncCallbackQueueImpl (webpack://slate-editor/node_modules/react-dom/cjs/react-dom.development.js#11084)
flushSyncCallbackQueue (webpack://slate-editor/node_modules/react-dom/cjs/react-dom.development.js#11072)
scheduleUpdateOnFiber (webpack://slate-editor/node_modules/react-dom/cjs/react-dom.development.js#21199)
dispatchAction (webpack://slate-editor/node_modules/react-dom/cjs/react-dom.development.js#15660)
onChange (webpack://slate-editor/src/CodeHighlightingExample.jsx#49)
onContextChange (webpack://slate-editor/src/components/slate.tsx#41)
onChange (webpack://slate-editor/src/plugin/with-react.ts#268)
batchedUpdates$1 (webpack://slate-editor/slate/node_modules/react-dom/cjs/react-dom.development.js#21856)
onChange (webpack://slate-editor/src/plugin/with-react.ts#264)
apply (webpack://slate-editor/src/create-editor.ts#86)
apply (webpack://slate-editor/src/create-editor.ts#84)
apply (webpack://slate-editor/src/plugin/with-react.ts#133)
apply (webpack://slate-editor/src/with-history.ts#106)
flushNativeEvents (webpack://slate-editor/src/utils/native.ts#48)
flushNativeEvents (webpack://slate-editor/src/utils/native.ts#47)

@jaked
Copy link
Contributor

jaked commented Sep 23, 2021

it seems like it would be worth trying to bisect between a broken version and a working version to figure out what's different. Could you post your create-react-app code somewhere?

@mterezac
Copy link

That's what I've been trying to do without any luck.
I just added my code to github. Here is the repo: https://github.com/mterezac/slate-prism-editor (not sure if this is the best location to share, hope it's ok).

@jaked
Copy link
Contributor

jaked commented Sep 25, 2021

the Webpack build blew up for me and I wasn't sure how to debug it, so I switched to Snowpack and ripped some stuff out, see here:

https://github.com/jaked/slate-prism-editor

If you run this with yarn dev (Snowpack dev server) the bug happens; if you run yarn build; yarn serve (Snowpack build + http-server) the bug doesn't happen. (Is this the same for you with Webpack?)

environment bug
Webpack dev server yes
Webpack build ?
Snowpack dev server yes
Snowpack build no
static HTML page no
CodeSandbox yes
Next.js dev server no

???

@dylans probably worth reopening the issue :(

@jaked
Copy link
Contributor

jaked commented Sep 28, 2021

I happened to be reading A (Mostly) Complete Guide to React Rendering Behavior and this jumped out at me:

React will double-render components inside of a tag in development.

@mterezac mentioned that "In the broken version, the TextString component is called twice", and had I noticed React.StrictMode in the example code. Turns out this is the problem! If I remove StrictMode (in my Snowpack dev build) the bug goes away, and if I add StrictMode to the Slate site examples (in the Next.js dev build), the bug appears.

The existing code in TextString is

const TextString = (props: { text: string; isTrailing?: boolean }) => {
  ...

  if (ref.current && ref.current.textContent !== text) {
    forceUpdateFlag.current = !forceUpdateFlag.current
  }

  // This component may have skipped rendering due to native operations being
  // applied. If an undo is performed React will see the old and new shadow DOM
  // match and not apply an update. Forces each render to actually reconcile.
  return (
    <span data-slate-string ref={ref} key={forceUpdateFlag.current ? 'A' : 'B'}>
    ...

which, if rendered twice, will flip forceUpdateFlag twice, producing an unchanged key. A fix would be to keep a counter instead of a flag, increment it instead of flipping the flag, and use the counter value as the key. And in fact this fix works. PR forthcoming.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
7 participants