[IDE] - Highlighting is broken #28

Open
harm-smits opened this issue Mar 25, 2020 · 9 comments

@harm-smits

harm-smits commented Mar 25, 2020

The highlighting of logical blocks seems a bit weird; take a look at the attached screenshot:

[screenshot: broken_highlighting]

@ajkhoury
Owner

Yeah, this issue is very annoying. I tried fixing it once and was unable to come up with a solution without breaking the language flex. I think it has something to do with the CRLF token. I will have to revisit this soon. Thanks for opening an issue for this!

@harm-smits
Author

For the record, the file is in LF and the issue persists across the different available line separators. I'll see if I can pick up Java and perhaps find the bug, but I'm not quite sure, as I have barely touched it in the past years.

@ajkhoury
Owner

ajkhoury commented Mar 25, 2020

Ironically, these issues are not in any of the Java code. They are in my lexer definition and language rules: _NASMLexer.flex and NASM.bnf respectively.

These two files are used to generate the lexer and parser for the language, which is what IntelliJ uses to organize symbols and highlight code.

I am quite incompetent when it comes to writing JFlex and BNF rules, so any help from someone with a better brain than mine would be appreciated!

@harm-smits
Author

Ah I see. I did some stuff with lexical analysis in the past. Let me see what I can figure out.

@ajkhoury
Owner

ajkhoury commented Mar 26, 2020

I really appreciate it!

I have updated much of the BNF and JFlex today. Your macros.asm file should now parse without errors as is, though I have not been able to solve this issue yet.

@harm-smits
Author

This might be interesting to have a look at: rouge-ruby/rouge#1428

@harm-smits
Author

harm-smits commented Mar 26, 2020

As far as I can see, the following in the lex file needs some changing, and we should explore a number of things as well:

  • We should ignore case altogether (just like NASM does);
  • We will have to check for esoteric instruction sets that we might have left out. A good place to start is the Intel documentation. A quick regex over it will give you all instructions available as of right now.
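
As a rough illustration of the first point, here is a minimal sketch of a case-insensitive mnemonic lookup done on the Java side. The class name `MnemonicTable` and its three entries are made up for this example and are not part of the plugin. JFlex also documents an `%ignorecase` directive that would handle this inside the .flex file; whether it plays well with the existing rules is untested here.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of the plugin: a case-insensitive mnemonic table.
final class MnemonicTable {
    // A TreeSet ordered by CASE_INSENSITIVE_ORDER treats "mov", "MOV" and "Mov" as the same key.
    private static final Set<String> MNEMONICS = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);

    static {
        // Illustrative subset only; a real table would be generated from the instruction reference.
        MNEMONICS.add("mov");
        MNEMONICS.add("add");
        MNEMONICS.add("vmovdqu");
    }

    // True when the word is a known mnemonic, regardless of letter case.
    static boolean isMnemonic(String word) {
        return MNEMONICS.contains(word);
    }
}
```

With this table, `MnemonicTable.isMnemonic("MOV")` and `MnemonicTable.isMnemonic("mov")` give the same answer, while an unknown word such as a macro name is simply not found.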

We need to consider whether we should write a more complicated flex file that defines which register types are allowed after a given instruction (e.g. pmovdqu) and the number of comma-separated trailing operands. Right now something like mov rdi, rdi, rdi, rdi, ymm0 is considered valid, and we could then invalidate it accordingly. This can be quite hard to do with the macros and all, though.

We should decide whether we want the token type checking to be done in a wrapper class or in the flex file. Both have advantages, but I feel it might be easier to write a wrapper class, as it could reduce the amount of duplicate code when adding autocomplete. It would also make the macro problem mentioned above less of a pain, as we can just collect the defined macros, or write a separate pre-processor for the macros that generates a list of possible names. That list could in turn be shared with autocomplete.
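
To make the wrapper-class idea a bit more concrete, here is a minimal sketch under several assumptions. `AsmTokenChecker`, its two-entry operand table, and both method names are hypothetical and do not exist in the plugin. It only understands the plain `%macro name count` form, and it splits operands naively on commas, so comments, labels and string literals are not handled.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper, not the plugin's API: collects macro names and checks operand counts.
final class AsmTokenChecker {
    // Matches "%macro name" at the start of a line, ignoring case and leading whitespace.
    private static final Pattern MACRO_DEF =
            Pattern.compile("(?im)^\\s*%macro\\s+([A-Za-z_.?][\\w.$#@~?]*)");

    // Illustrative operand counts; a real table would come from the instruction reference.
    private static final Map<String, Integer> OPERAND_COUNT = new HashMap<>();

    static {
        OPERAND_COUNT.put("mov", 2);
        OPERAND_COUNT.put("ret", 0);
    }

    // Pre-pass over the source text: gather macro names for highlighting and autocomplete.
    static Set<String> collectMacroNames(String source) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = MACRO_DEF.matcher(source);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    // Returns false only when the mnemonic is known and its operand count is wrong.
    // Unknown first words (possible macro calls) are always accepted.
    static boolean operandCountOk(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return true;
        }
        String[] parts = trimmed.split("\\s+", 2);
        Integer expected = OPERAND_COUNT.get(parts[0].toLowerCase(Locale.ROOT));
        if (expected == null) {
            return true;
        }
        int actual = (parts.length < 2 || parts[1].trim().isEmpty())
                ? 0
                : parts[1].split(",").length;
        return actual == expected;
    }
}
```

The same set returned by `collectMacroNames` could feed both the highlighter and the autocomplete list, which is the code-sharing benefit mentioned above. Because unknown first words are accepted, a macro call with any number of arguments is never flagged.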

@ajkhoury
Owner

ajkhoury commented Mar 26, 2020

Ignoring case is something I already do partially in the regular expression tokens; ignoring it completely shouldn't be too hard to finish.

Yes, there are still a handful of instructions that I have not defined yet and need to add eventually. I have mostly been adding instruction sets as I use them in my own personal projects.

The biggest problem I have had so far is writing this plugin so that the lexer can identify macro calls, which lets us highlight them; otherwise everything can just be treated as a plain identifier. I more or less already follow the approach that rouge-ruby and chroma take, allowing plain identifiers to be used in place of instructions that aren't already matched by a Mnemonic rule. Enforcing the number of comma-separated trailing registers for a certain mnemonic is a pretty darn near impossible task because NASM's macro preprocessor has to be supported. I would like to keep the lexer as lenient as possible for later, when I plan to add MASM support and allow entirely different architectures (ARM, MIPS, etc.).
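
The lenient fallback described here can be sketched roughly as follows. `LenientClassifier` and its `Kind` enum are invented for illustration and are not the plugin's token types. The sketch assumes the mnemonic set is stored in lowercase and that macro names are compared case-sensitively, as for macros defined with `%macro`.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch, not the plugin's lexer: lenient classification of a line's first word.
final class LenientClassifier {
    enum Kind { MNEMONIC, MACRO_CALL, IDENTIFIER }

    // knownMnemonics is assumed to hold lowercase names; macroNames is matched case-sensitively.
    static Kind classify(String word, Set<String> knownMnemonics, Set<String> macroNames) {
        if (knownMnemonics.contains(word.toLowerCase(Locale.ROOT))) {
            return Kind.MNEMONIC;
        }
        if (macroNames.contains(word)) {
            return Kind.MACRO_CALL;
        }
        // Never an error: anything unrecognised stays a plain identifier.
        return Kind.IDENTIFIER;
    }
}
```

Since nothing is ever rejected, the same classifier would keep working for instruction sets that have not been added yet, at the cost of not catching typos in mnemonics.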

@harm-smits
Author

Okay, I found another link that lists all instructions currently supported by NASM; it could be interesting as well: https://www.nasm.us/xdoc/2.14rc3/nasmdoc.pdf
