Doc strings #134

Open
j14159 opened this issue Feb 8, 2017 · 19 comments
Comments

@j14159
Collaborator

j14159 commented Feb 8, 2017

We need to be able to (eventually) output docs for modules (as javadoc, godoc, etc. do). I'm fine with something like:

The block comment immediately preceding the first definition of a function (or its type specification) is used as the documentation for it. The block comment immediately preceding the module declaration is used as the overall module documentation.

And maybe use markdown for formatting. Open questions to me are:

  • listing parameter names and descriptions. I like this about javadoc and would prefer to have it in Alpaca as well.
  • code blocks in triple back-ticks or drop in a little <code>...</code> set of tags?
  • referring to variables with @, with the typer checking that the references point to valid variables; this might depend a bit on what we decide in "Type specifications for top-level bindings" (#133).

Would love more ideas/opinions on this.
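
The block-comment rule proposed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes block comments are written `{- ... -}`, that a module is declared with `module name`, and that top-level functions start with `let name`; none of these syntax details are confirmed in this thread. The names `DOC_BEFORE_DEF` and `extract_docs` are hypothetical.

```python
import re

# Assumed syntax (not confirmed by this thread): block comments are
# written {- ... -}, modules are declared with `module name`, and
# top-level functions are introduced with `let name ...`.
DOC_BEFORE_DEF = re.compile(
    r"\{-(?P<doc>(?:(?!-\}).)*)-\}[ \t]*\n"      # one block comment, then a newline
    r"(?P<kind>module|let)\s+(?P<name>[a-z_]\w*)",  # definition on the very next line
    re.DOTALL,
)


def extract_docs(source):
    """Map (kind, name) to the block comment immediately preceding it."""
    docs = {}
    for match in DOC_BEFORE_DEF.finditer(source):
        key = (match.group("kind"), match.group("name"))
        # Only the first definition of a name carries the documentation.
        docs.setdefault(key, match.group("doc").strip())
    return docs


EXAMPLE = """{- Basic math helpers. -}
module math

{- Adds two integers. -}
let add x y = x + y

{- Not attached: a blank line follows. -}

let sub x y = x - y
"""

if __name__ == "__main__":
    for key, doc in extract_docs(EXAMPLE).items():
        print(key, "->", doc)
```

A real implementation would more likely attach comments in the parser than use a regular expression, but the sketch shows how "immediately preceding" can be made precise: a comment separated from the definition by a blank line is not treated as documentation.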

@yurrriq
Contributor

yurrriq commented Feb 12, 2017

I really like how Elm and Idris handle docstrings, especially the idrisexample syntax and markdown support.

@OvermindDL1

> I really like how Elm and Idris handle docstrings, especially the idrisexample syntax and markdown support.

Or look at Elixir, it has @doc strings that are markdown, can contain doctests (actual test code that is run at test time), etc...

@arpunk
Contributor

arpunk commented Feb 13, 2017

I agree with @OvermindDL1. LFE also has docstrings (but we don't support doctests).

@yurrriq
Contributor

yurrriq commented Feb 13, 2017

I'm wary of heredocs encouraging overly verbose module files, but I suppose support does not equate to encouragement of verbosity. Maybe in a user guide or similar we could advocate against giant in-module docs. FWIW I recommend using the EDoc format internally for maximum compatibility. I was experimenting with that in lodox but have been tied up with other things lately.

@arpunk
Contributor

arpunk commented Feb 13, 2017

@yurrriq is right, EDoc format should be used to ensure compatibility. Also keep in mind that documentation for Alpaca code is probably going to be published on hexdocs.pm.

Regarding verbose module files filled with documentation, I think there are pros and cons. In Elixir there are modules that contain 5 lines of code and 50 lines of documentation, but that plays nicely once you render the docs. I also find it useful to have some examples available.

@yurrriq
Contributor

yurrriq commented Feb 13, 2017

An agreeable solution to me would be to add support for collapsing docstrings in alpaca-mode.

@j14159
Collaborator Author

j14159 commented Feb 19, 2017

I would generally prefer block comments for docs rather than heredoc strings; I think it keeps things a little cleaner in general and I believe it is still amenable to collapsing in alpaca-mode. Additionally, doctests are something I'm a bit jealous of in both Rust and Elixir: I'd definitely like for code blocks in a doc comment/string to be type-checked and run as tests as well. I do like parts of EDoc (e.g. @param, as in javadoc) but I suspect it will be limiting to adopt it entirely. Does Elixir just output edoc-compatible stuff or something different?

Something I'd like to consider - but have no idea how to go about it yet - is multi-lingual docs or some relatively simple way to internationalize them/support simpler translation efforts.
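
To make the doctest idea above a little more concrete, here is a minimal Python sketch of the first step: pulling fenced code blocks out of a markdown doc comment so that a later stage could type-check and run them. It assumes triple-back-tick fences (one of the options raised earlier in this thread); the function name `extract_code_blocks` is hypothetical, and the type-checking and execution steps are deliberately left out.

```python
# The fence marker is built programmatically so that this sketch can
# itself be shown inside a fenced block.
FENCE = "`" * 3


def extract_code_blocks(doc):
    """Return the body of every fenced code block found in a doc string."""
    blocks = []
    current = None  # None while outside a fence, a list of lines inside one
    for line in doc.splitlines():
        if line.strip().startswith(FENCE):
            if current is None:
                current = []  # opening fence (any language tag is ignored)
            else:
                blocks.append("\n".join(current))  # closing fence
                current = None
        elif current is not None:
            current.append(line)
    # An unterminated block is dropped rather than guessed at.
    return blocks


if __name__ == "__main__":
    doc = "\n".join([
        "Adds two integers.",
        "",
        FENCE,
        "add 1 2",
        FENCE,
    ])
    print(extract_code_blocks(doc))
```

Each returned block could then be handed to the typer and a test runner, which is where most of the real work would be.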

@arkgil

arkgil commented Mar 29, 2017

Could someone explain to me the need for compatibility with EDoc? Elixir's ExDoc outputs HTML files, and in my opinion the documentation it produces is more readable and comprehensible than what EDoc produces. Anyway, I'm wondering what the advantages are.

FWIW I'd opt for markdown formatting too 🙂

EDIT: I've just realised it might be for generating docs of entire project, written in both Erlang and Alpaca.

@arkgil

arkgil commented Mar 29, 2017

Another interesting idea would be to embed documentation into compiled files (which is done by the Elixir compiler), which would make docs accessible from within the repl (which is awesome). It's probably a really long-term idea, but might be worth considering.

@yurrriq
Contributor

yurrriq commented Mar 29, 2017

@arkgil: I'm advocating using EDoc as the internal format. From that you can generate whatever HTML you want. I agree the default output is not so desirable. The point of using that format is that we'd be completely compatible with existing Erlang docs and tooling, which would make, e.g., generating polyglot docs a breeze.

@j14159: Elixir uses its own doc records, something like #elixir_doc_v1{} which are not (immediately) compatible with EDoc. They then embed the docs in a chunk of the compiled .beam. We do something similar in LFE.

I suppose another approach would be to use a custom #alpaca_doc_v1{}-filled chunk and then write code to translate that to/from EDoc format. Perhaps then we can have our cake and eat it too.

@arkgil: Embedding docs in beam chunks is simple, so I'd say that's short-term low-hanging fruit. 😄

@yurrriq
Contributor

yurrriq commented Mar 29, 2017

PS One issue I ran into when working toward translating LFE docs to EDoc format was that we deal with docs on an application/module level (as in, we know about sibling modules in the parent app, etc) whereas EDoc seems to go about it on a per-file basis.

@yurrriq
Contributor

yurrriq commented Mar 29, 2017

Potentially informative references:

@arkgil

arkgil commented Mar 29, 2017

Thanks @yurrriq ! I wasn't aware it's simple, thank you for the references 🙂

@josevalim

For reference, after Erlang/OTP 20 is out, I will send a proposal to OTP and the community to unify how documentation is shared across BEAM languages. The goal is to have a BEAM chunk that stores the documentation with some metadata. For example, you could think of a "Docs" chunk as a list of tuples where each entry looks like this (written in Erlang):

{{foo, 1}, <<"this is the documentation">>, [{line, 13}, ...]}

Somewhere we also need to store the format of the documentation (markdown, edoc, etc).
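
As a rough illustration of the shape described above, here is a small Python model of such a chunk: one entry per function keyed by name and arity, the doc text, a metadata mapping, and a single chunk-wide format field. Only the `{{foo, 1}, Doc, Metadata}` tuple shape comes from the comment; the class and field names (`DocEntry`, `DocsChunk`, `doc_format`) are hypothetical and not part of any proposal.

```python
from dataclasses import dataclass, field


@dataclass
class DocEntry:
    """One entry, mirroring {{Name, Arity}, Doc, Metadata}."""
    name: str
    arity: int
    text: str
    metadata: dict = field(default_factory=dict)


@dataclass
class DocsChunk:
    """Hypothetical container: the format is stored once for the chunk."""
    doc_format: str  # e.g. "markdown" or "edoc"
    entries: list = field(default_factory=list)

    def lookup(self, name, arity):
        """Return the entry for name/arity, or None if undocumented."""
        for entry in self.entries:
            if (entry.name, entry.arity) == (name, arity):
                return entry
        return None


if __name__ == "__main__":
    chunk = DocsChunk(
        doc_format="markdown",
        entries=[DocEntry("foo", 1, "this is the documentation", {"line": 13})],
    )
    print(chunk.lookup("foo", 1))
```

Keeping the format next to the entries is what would let a shell helper decide whether it can render the text or has to show it raw.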

This chunk should replace the "LDoc" used by LFE and the "ExDc" chunk used by Elixir. If we can agree on the same chunk format, then we can make tools like ExDoc generate documentation regardless of the language. Functions for accessing documentation in the shell should also work across languages, regardless of whether they were written in Elixir, Alpaca or LFE.

The only issue is the documentation format. If Elixir docs are written in Markdown and a language does not know how to parse Markdown, then they will have to choose to either not show the Elixir docs or show them in a raw format (i.e. in Markdown). This works fine for Markdown but it would likely be painful if your documentation is written in XML. However, regardless of the format, if this new chunk is accepted, all languages should be capable of showing docs for themselves and at least Erlang.

This new chunk is almost fully orthogonal to this discussion except when it comes to storage. Elixir (and possibly LFE - please correct me @yurrriq) stores the documentation in a BEAM chunk at compilation time. This is easy in Elixir because the documentation is an annotation:

@doc "says hello world"
def hello_world do
  IO.puts "hello world"
end

However, if you write the documentation in the same syntax as code comments, converting them into docs may be more complicated. If you want to do it during compilation time, you will need to parse the comments out in the alpaca parser. Otherwise you will need an explicit step which takes the documentation and embeds it into the .beam file (which will likely be the approach that needs to be taken in Erlang itself - since a lot of the documentation is in a separate XML file).

A third approach is the one done by Rust, which is a mix of both solutions. In Rust they use the code comments syntax plus an extra token to mark it as documentation. So in Rust:

/// Hello world
fn main ...

becomes:

#[doc = "Hello world"]
fn main ...

So in Rust you use the code comments syntax but they end up embedded as an annotation, quite similar to Elixir.
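
The "comment syntax plus an extra marker" approach can be sketched as a simple source rewrite. The Python below is illustrative only and is not how rustc is implemented: it turns lines starting with `///` into a `#[doc = "..."]` attribute line and leaves ordinary `//` comments alone. As a simplification it trims whitespace around the comment text, whereas the real compiler keeps the text verbatim. The function name `desugar_doc_comments` is hypothetical.

```python
def desugar_doc_comments(lines):
    """Rewrite `/// text` lines into `#[doc = "text"]` attribute lines."""
    out = []
    for line in lines:
        stripped = line.lstrip()
        indent = line[: len(line) - len(stripped)]
        # Exactly three slashes mark documentation; `//` and `////`
        # remain ordinary comments.
        if stripped.startswith("///") and not stripped.startswith("////"):
            text = stripped[3:].strip()  # simplification: trim whitespace
            escaped = text.replace("\\", "\\\\").replace('"', '\\"')
            out.append(f'{indent}#[doc = "{escaped}"]')
        else:
            out.append(line)
    return out


if __name__ == "__main__":
    source = ["/// Hello world", "// just a comment", "fn main() {}"]
    print("\n".join(desugar_doc_comments(source)))
```

The appeal for a compiler is that after this step documentation is ordinary attribute data attached to the next item, so later stages never need to look at comments again.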

@erszcz
Contributor

erszcz commented Jun 16, 2017

FYI, I am working on a documentation viewer and extractor for the Erlang shell - docsh. This is a free time effort, though, so the pace is rather slow.

@josevalim's idea of a common chunk format for all BEAM languages is definitely the way to go - let's see how the OTP team responds. It would solve the problem of accessing docs across languages without hacks like this one to access docsh-generated documentation from IEx (where :beam_doc_provider is docsh_iex). Without a common format, we would need such a doc provider for each of ExDc, LDoc, Alpaca doc chunk, etc. On the other hand, the providers would give language authors freedom to choose the documentation format.

Regarding the format, I do not second @yurrriq's idea to make EDoc the default. My personal impression is that the leeway in its design (or evolution?) led to too many similar-yet-different @-tags, which end up not being used consistently. The most prominent examples are @type and @spec, now redundant due to the corresponding attributes. Moreover, even OTP itself is not documented completely in EDoc, but mostly in out-of-source XML. A lightweight Markdown-based approach (e.g. Elixir's) seems much more convenient.

@j14159
Collaborator Author

j14159 commented Jun 16, 2017

I like the BEAM chunk idea that @josevalim is suggesting, seems in line with the AST chunk work lately as well. I don't really have a problem with producing edoc-compatible stuff (or leaving that to a tool or some sort of plugin) but the issue with external docs in XML raised by @erszcz makes me second-guess the utility of it.

I'm pretty firm on wanting things like doc-tests but have discussed this with others (h/t @talentdeficit for ideas and inspiration) and have thoughts about linking or tagging tests in docs so they can be output as in-line examples later, rather than requiring them to be written in-line up front.

Things I don't know how to handle yet include but are not limited to:

  • validating docs against code, e.g. types in docs have to agree with the code, or argument lists, etc. I don't know how this should work yet but I'd like to have it.
  • documentation for different function heads. E.g. should we interleave as examples?

I'm also still pretty keen on markdown for docs in block comments but I think we'll need some sort of annotations in them to make them really useful.
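
One small piece of the "validate docs against code" question, checking documented parameters against the argument list, can be sketched as follows. This assumes a javadoc-style `@param name` annotation inside the doc comment, which is only one of the options floated in this thread; the function name `check_params` is hypothetical, and checking documented types against inferred types is not attempted.

```python
import re


def check_params(doc, arg_names):
    """Compare @param annotations in a doc comment with the real arguments.

    Returns (missing, unknown): arguments with no @param entry, and
    @param entries that do not match any argument.
    """
    documented = re.findall(r"@param\s+(\w+)", doc)
    missing = [arg for arg in arg_names if arg not in documented]
    unknown = [name for name in documented if name not in arg_names]
    return missing, unknown


if __name__ == "__main__":
    doc = "Adds two numbers.\n@param x the first number\n@param z a typo"
    print(check_params(doc, ["x", "y"]))
```

A compiler could surface both lists as warnings, so that renaming an argument without updating its documentation is caught at build time.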

@josevalim

> I'm also still pretty keen on markdown for docs in block comments but I think we'll need some sort of annotations in them to make them really useful.

FWIW, Elixir has a single "extension" to markdown, which is using backticks to provide autolinks. When generating the HTML documentation, `Foo.bar/3` will automatically link to the function bar with arity 3 in module Foo. `c:Foo.bar/3` links to a callback in the module Foo with name bar and arity 3. `t:Foo.bar/2` links to the type bar/2 in Foo. `Foo` links to a module named Foo. `bar/3` links to a local function named bar with arity 3.
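
The autolink rules just described can be approximated with a couple of regular expressions. The Python below is a simplified sketch of the idea and not ExDoc's actual implementation: it scans a doc string for backtick spans and classifies each one as a function, callback, type or module reference. The names `classify` and `find_autolinks` are hypothetical.

```python
import re

# Optional c:/t: prefix, optional Module. qualifier, then name/arity.
FUNCTION_REF = re.compile(
    r"(?:(?P<prefix>[ct]):)?"
    r"(?:(?P<module>[A-Z]\w*(?:\.[A-Z]\w*)*)\.)?"
    r"(?P<name>[a-z_]\w*[?!]?)/(?P<arity>\d+)"
)
MODULE_REF = re.compile(r"[A-Z]\w*(?:\.[A-Z]\w*)*")
KINDS = {"c": "callback", "t": "type", None: "function"}


def classify(span):
    """Return (kind, module, name, arity) for a reference, else None."""
    match = FUNCTION_REF.fullmatch(span)
    if match:
        return (
            KINDS[match.group("prefix")],
            match.group("module"),  # None means a local reference
            match.group("name"),
            int(match.group("arity")),
        )
    if MODULE_REF.fullmatch(span):
        return ("module", span, None, None)
    return None  # ordinary inline code, not a link


def find_autolinks(doc):
    """Collect every backtick span in doc that looks like a reference."""
    spans = re.findall(r"`([^`]+)`", doc)
    return [ref for ref in map(classify, spans) if ref is not None]


if __name__ == "__main__":
    print(find_autolinks("See `Foo.bar/3`, `t:Foo.bar/2` and `some text`."))
```

A doc generator would then check each reference against the known modules and functions and emit a link only when the target exists.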

> documentation for different function heads

In Erlang, function heads are an implementation detail. I am not sure how much that holds in Alpaca since it is a statically typed language. But imagine that you decided to change the implementation of a function and move the different clauses to a private function: would this force you to rewrite the docs? If so, is that a good thing or a bad thing? A lot of people object to writing documentation alongside the source code exactly because you may end up coupling the two, so it is worth considering how and when code changes should affect the docs.

@OvermindDL1

I'd honestly say that the 'format' of the doc (and thus you could embed multiple formats too in the BEAM files) should be a mimetype or some specific equivalent thereof.
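
A minimal Python sketch of how a viewer might use such a tag: docs are stored per MIME type, the viewer renders the first format it has a renderer for, and otherwise falls back to showing the raw text (the fallback discussed earlier in this thread). The function name `render_docs` and the dictionary layout are hypothetical; `text/markdown` is a registered MIME type, but nothing here reflects an agreed chunk format.

```python
def render_docs(docs_by_mime, renderers):
    """Render the first stored format that a renderer exists for.

    docs_by_mime maps a MIME type to doc text; renderers maps a MIME
    type to a function that turns doc text into displayable output.
    """
    for mime_type, text in docs_by_mime.items():
        renderer = renderers.get(mime_type)
        if renderer is not None:
            return renderer(text)
    # No renderer understood any stored format: show the first one raw.
    return next(iter(docs_by_mime.values()), "")


if __name__ == "__main__":
    renderers = {"text/markdown": lambda text: "<p>" + text + "</p>"}
    stored = {"application/xml": "<doc/>", "text/markdown": "says hello"}
    print(render_docs(stored, renderers))
```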

@lepoetemaudit
Contributor

In the short term, if we parse the docstrings and store them in the AST, we can just write them out when we write out the type information in the AST into the module attributes. It would be easy enough to walk that structure to extract the comments straight from the compiled module. It is a little ugly doing it like this, so having a dedicated chunk in the generated BEAM would be great if that gets adopted.

I'm broadly in favour of markdown and the simple approach of Elixir with autolinking backticks as @josevalim describes. We'll already have type information, whether inferred or specified via type signatures, so I don't know if we'd need specific kinds of annotations in docstrings other than simple hotlinks.

8 participants