lua: function declared inside an object is not tagged correctly #1798

Open
doronbehar opened this issue Jul 17, 2018 · 7 comments

Comments

@doronbehar

doronbehar commented Jul 17, 2018

The name of the parser: lua
The command line you used to run ctags:

$ ctags --options=NONE test.lua

The content of input file:

myVar = {
	obj = 12,
	myMethod = function()
	end,
	str = "bafsg"
}

return myVar

The tags output you are not satisfied with:

!_TAG_FILE_FORMAT	2	/extended format; --format=1 will not append ;" to lines/
!_TAG_FILE_SORTED	1	/0=unsorted, 1=sorted, 2=foldcase/
!_TAG_OUTPUT_MODE	u-ctags	/u-ctags or e-ctags/
!_TAG_PROGRAM_AUTHOR	Universal Ctags Team	//
!_TAG_PROGRAM_NAME	Universal Ctags	/Derived from Exuberant Ctags/
!_TAG_PROGRAM_URL	https://ctags.io/	/official site/
!_TAG_PROGRAM_VERSION	0.0.0	/5a4b6d04/
myMethod	test.lua	/^	myMethod = function()$/;"	f

The tags output you expect:

!__ COMMENTS __!
myVar:myMethod	test.lua	/^	myMethod = function()$/;"	f

The version of ctags:

$ ctags --version
Universal Ctags 0.0.0(5a4b6d04), Copyright (C) 2015 Universal Ctags Team
Universal Ctags is derived from Exuberant Ctags.
Exuberant Ctags 5.8, Copyright (C) 1996-2009 Darren Hiebert
  Compiled: Jul 11 2018, 12:14:12
  URL: https://ctags.io/
  Optional compiled features: +wildcards, +regex, +iconv, +option-directory, +xpath, +json, +interactive, +sandbox, +yaml

ctags was built from source from this repository.

@masatake
Member

Thank you for contacting us.

The Lua parser is too incomplete to implement what you want quickly. For example, it doesn't capture
a variable like myVar in your example.
The current implementation is line-oriented. It must be rewritten in a token-oriented style.
Rewriting it is a small task.

The difficulty lies in deciding what kind of output we want for a dynamic language like Lua.
JavaScript looks similar to Lua to me.
In addition, the JavaScript parser of ctags was written by @b4n, an experienced developer.
So I would like to reuse the output design used in the JavaScript parser.
(I hope you know the syntax of JavaScript.)

Js input:

[jet@localhost]/tmp% cat foo.js 
var x = {
    slot: {
	1: function () {},
	myMethod: function () {},
	type: ["func"],
    }
}

ctags output:

[jet@localhost]/tmp% u-ctags --fields=+K -o - /tmp/foo.js
1	/tmp/foo.js	/^	1: function () {},$/;"	method	class:x.slot
myMethod	/tmp/foo.js	/^	myMethod: function () {},$/;"	method	class:x.slot
slot	/tmp/foo.js	/^    slot: {$/;"	class	class:x
type	/tmp/foo.js	/^	type: ["func"],$/;"	property	class:x.slot
x	/tmp/foo.js	/^var x = {$/;"	class

Capturing 1 as a method is a bit strange to me. However, it is understandable and consistent with the other items.

I tried the same thing in lua:

x = {
  slot = {
       [1] = function ()
       end,
       myMethod = function ()
       end,
       type = { "func" },
  }
}

If I rewrite the lua parser, the parser may print:

1	/tmp/foo.lua	/^	[1] = function ()$/;"	method	class:x.slot
myMethod	/tmp/foo.lua	/^	myMethod = function ()$/;"	method	class:x.slot
slot	/tmp/foo.lua	/^    slot = {$/;"	class	class:x
type	/tmp/foo.lua	/^	type = {"func"},$/;"	property	class:x.slot
x	/tmp/foo.lua	/^x = {$/;"	class

What do you think about this output?
I used class, but would table be better?
I used method, but would function be better?
I used property, but would key or something else be better?

Another question is about the : that combines myVar and myMethod in myVar:myMethod.
I wonder why it is not myVar.myMethod. When writing a Lua program, the programmer knows which combinator(?) s/he should use. What kind of rules should ctags apply when combining names?

Once the above questions are settled, I can work on the original issue you reported.
With --extras=+q, ctags can emit combined names like myVar.myMethod (or myVar:myMethod).

@doronbehar
Author

Thanks for your reply, I hope it will be easy enough to rewrite the lua parser as you say.

First of all, the most important point for me in this discussion is that ctags should write (with my example) myVar:myMethod in the tags file and not just myMethod. That's because the function myMethod will be called as myVar:myMethod() in other locations. Otherwise a text editor would have no idea that myMethod, when called as myVar:myMethod(), is defined under myVar and not under some other table which may have myMethod() defined for it as well.
In other words, what if we'd have this test.lua:

myVar = {
  obj = 12,
  myMethod = function()
  end,
  str = "bafsg"
}

myOtherVar = {
  obj = 10,
  myMethod = function(a, b)
    return a, b
  end,
  str = "aaadsgsg"
}

Currently it produces the following tags:

myMethod	test.lua	/^	myMethod = function()$/;"	f
myMethod	test.lua	/^	myMethod = function(a, b)$/;"	f

Which gives text editors several locations to look for when searching for the definition of myVar:myMethod() or myOtherVar:myMethod(arg1, arg2).

As for the naming dilemmas for the kinds in the tags file, key is better than property, and table (IMO) is better than class. But as for the function vs. method issue, what if we put both of them in the tags file? Here is my proposal:

In Lua, you can declare or call a function in a table using either : or ., but with :, self is passed to it as an implicit first argument (source: https://stackoverflow.com/q/4911186/4935114).
Therefore, for the following test.lua file:

x = {
  foo = function(a,b)
    return a
  end,
  bar = function(a,b)
    return b
  end
}

I'd consider the following tags file the best:

x.foo	test.lua	/^	foo = function(a,b)$/;"	f
x:foo	test.lua	/^	foo = function(a,b)$/;"	f
x.bar	test.lua	/^	bar = function(a,b)$/;"	f
x:bar	test.lua	/^	bar = function(a,b)$/;"	f

In addition, there is another issue which perplexes me:
What if we have a file foo.lua like in my original example:

myVar = {
  obj = 12,
  myMethod = function()
  end,
  str = "bafsg"
}

return myVar

And it is loaded in another Lua file with foo = require('foo'). This makes the function myMethod available as foo:myMethod. Then, when the user tries to find the definition of foo:myMethod through their text editor, they actually have to look for the definition of myVar:myMethod, since this is how it was defined in the original file. Should we leave this burden to the writers of text editor plugins, or should we actually put this in the tags file:

foo:myMethod	foo.lua	/^	myMethod = function()$/;"	f

(Since the basename of the file is foo)
Instead of this:

myVar:myMethod	foo.lua	/^	myMethod = function()$/;"	f

?

@eliasdaler

I'm very interested in improving the Lua parser. There are a lot of things to consider, but it looks like it should work roughly like the JS/Python/Ruby parsers, which are pretty good as far as I know.

@masatake, what would be a good starting point for improving the parser?

@masatake
Member

masatake commented Jul 21, 2018

@eliasdaler, thank you!

> @masatake, what would be a good starting point for improving the parser?

  • Please read "TAG ENTRIES" in man/ctags.1.rst.in. I would like you to understand the concepts "kind" and "definition".
  • As I wrote in the comment of this issue, the JavaScript parser in ctags may be helpful for designing kinds for Lua parser.
$ ./ctags --list-kinds-full=JavaScript
C       constant  yes     no      0      NONE   constants
c       class     yes     no      0      NONE   classes
f       function  yes     no      0      NONE   functions
g       generator yes     no      0      NONE   generators
m       method    yes     no      0      NONE   methods
p       property  yes     no      0      NONE   properties
v       variable  yes     no      0      NONE   global variables

Give a variety of inputs to the JavaScript parser, and see the output.

  • The implementation of the JavaScript parser (parsers/jscript.c) may help you. However, it doesn't use the advanced APIs (cork and tokeninfo). I guess you may want to use those APIs, especially cork.

  • The Python parser uses the cork API. The cork API may help you solve this issue, because it
    helps a parser handle the scopes of the target language. See also http://docs.ctags.io/en/latest/internal.html?highlight=cork#cork-api .

  • The Tcl parser (parsers/tcl.c) uses the tokeninfo API. Until I introduced the tokeninfo API, each parser wrote similar code to record and handle tokens. I studied that existing code and wrote a reusable version. That is the tokeninfo API.

  • The current implementation of the Lua parser is line-oriented. You may have to switch it to a token-oriented one.
    main/tokeninfo.h may help you to write a token-oriented parser.

  • I expect you to add many test cases for the Lua parser. See http://docs.ctags.io/en/latest/units.html .

  • Universal-ctags developers assume Universal-ctags is a rather low-level tool. It should provide plenty of raw information so that people working on client tools like vim can do interesting things. I will make more comments about this item.

@masatake
Member

@doronbehar, thank you.

Ctags is neither an interpreter nor a compiler, so I think tagging foo:myMethod is going too far for ctags.
How far assignments should be tracked is not obvious. For example, given:

x = myVar

x:myMethod would have to be captured as well.

Instead, just tagging myMethod gives an editor a chance to jump to the definition. In that case, the user of the editor must choose one of the candidates, as you wrote. However, ctags can provide hints that help the user choose the proper one.

A tags file can have a field named scope.
The output of the current implementation:

myMethod	test.lua	/^	myMethod = function()$/;"	f

By extending or rewriting the lua parser, ctags can emit:

myMethod	/tmp/foo.lua	/^	myMethod = function ()$/;"	f	table:myVar

See table:myVar at the end of the line. This is the scope field.
This helps you, the user.
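
As a rough sketch of how a client tool might use such a scope field, the Python below parses tag lines and keeps only the candidates defined in a given table. The table: field is the output proposed in this comment, not something the current Lua parser emits, and parse_tag_line and pick_by_scope are hypothetical helper names.

```python
def parse_tag_line(line):
    """Split one tags line into name, file, kind, and extension fields."""
    name, path, rest = line.rstrip("\n").split("\t", 2)
    # The search pattern may itself contain tabs, so cut at the ;" marker
    # that ends the address instead of splitting on every tab.
    # Assumption: the pattern does not contain the sequence ;"<TAB> itself.
    _address, _, ext = rest.partition(';"\t')
    parts = ext.split("\t") if ext else []
    kind = parts[0] if parts else ""
    fields = dict(p.split(":", 1) for p in parts[1:] if ":" in p)
    return {"name": name, "file": path, "kind": kind, "fields": fields}


def pick_by_scope(tags, name, scope_kind, scope_name):
    """Return only the candidates for name defined inside the given scope."""
    return [
        t for t in tags
        if t["name"] == name and t["fields"].get(scope_kind) == scope_name
    ]


# Hypothetical tags for the two-table example earlier in this thread.
lines = [
    'myMethod\ttest.lua\t/^\tmyMethod = function()$/;"\tf\ttable:myVar',
    'myMethod\ttest.lua\t/^\tmyMethod = function(a, b)$/;"\tf\ttable:myOtherVar',
]
tags = [parse_tag_line(l) for l in lines]
matches = pick_by_scope(tags, "myMethod", "table", "myVar")
```

An editor plugin that sees myVar:myMethod() under the cursor could split it at the separator and pass the two halves to a filter like this one.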

As I wrote to @eliasdaler, providing low-level information to a client tool like an editor is the mission of ctags though some parsers violate this principle.

However, I know tag entries such as myVar:myMethod and myVar.myMethod are useful, though they are not a must.
That case is what --extras=+q is for.

The following example shows how --extras=+q works:

[jet@localhost]~/var/ctags% cat /tmp/foo.cpp
class point {
  int x, y;
  int distanceFromOrigin(void);
};
[jet@localhost]~/var/ctags% ./ctags --kinds-C++=+p -o - /tmp/foo.cpp
distanceFromOrigin	/tmp/foo.cpp	/^  int distanceFromOrigin(void);$/;"	p	class:point	typeref:typename:int	file:
point	/tmp/foo.cpp	/^class point {$/;"	c	file:
x	/tmp/foo.cpp	/^  int x, y;$/;"	m	class:point	typeref:typename:int	file:
y	/tmp/foo.cpp	/^  int x, y;$/;"	m	class:point	typeref:typename:int	file:
[jet@localhost]~/var/ctags% ./ctags --kinds-C++=+p --extras=+q -o - /tmp/foo.cpp
distanceFromOrigin	/tmp/foo.cpp	/^  int distanceFromOrigin(void);$/;"	p	class:point	typeref:typename:int	file:
point	/tmp/foo.cpp	/^class point {$/;"	c	file:
point::distanceFromOrigin	/tmp/foo.cpp	/^  int distanceFromOrigin(void);$/;"	p	class:point	typeref:typename:int	file:
point::x	/tmp/foo.cpp	/^  int x, y;$/;"	m	class:point	typeref:typename:int	file:
point::y	/tmp/foo.cpp	/^  int x, y;$/;"	m	class:point	typeref:typename:int	file:
x	/tmp/foo.cpp	/^  int x, y;$/;"	m	class:point	typeref:typename:int	file:
y	/tmp/foo.cpp	/^  int x, y;$/;"	m	class:point	typeref:typename:int	file: 
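
The extra entries above pair each name with its scope. A client could approximate the same thing from the scope field alone, as in this minimal Python sketch. It is not how ctags implements --extras=+q; qualified_names is a hypothetical helper, and emitting both . and : for Lua is only the proposal made earlier in this thread.

```python
def qualified_names(name, scope, separators):
    """Build one fully qualified name per separator, e.g. point::x."""
    return [f"{scope}{sep}{name}" for sep in separators]


# C++ has a single scope separator.
cpp = qualified_names("distanceFromOrigin", "point", ["::"])

# For Lua, which separator to use is the open question in this issue,
# so this sketch simply produces both forms.
lua = qualified_names("myMethod", "myVar", [".", ":"])
```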

The signature field can provide hints to the user, too.

[jet@localhost]~/var/ctags% cat /tmp/foo.js
function f (a,b,c) {
    
}
[jet@localhost]~/var/ctags% u-ctags --fields=+S  -o - /tmp/foo.js
f	/tmp/foo.js	/^function f (a,b,c) {$/;"	f	signature:(a,b,c)
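
One way a client might use the signature field is to compare parameter counts, sketched below in Python. This is only a heuristic under stated assumptions: Lua accepts any number of arguments at a call site, a : call passes an extra self, and the parser here assumes a plain comma-separated parameter list. The candidate records and the arity and by_arity helpers are hypothetical.

```python
def arity(signature):
    """Count the parameters in a signature field such as "(a,b,c)"."""
    inner = signature.strip()[1:-1].strip()
    return 0 if not inner else len(inner.split(","))


def by_arity(candidates, n):
    """Keep the candidates whose signature declares exactly n parameters."""
    return [c for c in candidates if arity(c["signature"]) == n]


# Hypothetical records for the two myMethod definitions in this thread.
candidates = [
    {"name": "myMethod", "scope": "myVar", "signature": "()"},
    {"name": "myMethod", "scope": "myOtherVar", "signature": "(a, b)"},
]
two_args = by_arity(candidates, 2)
```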

@doronbehar
Author

@masatake, that's a great suggestion. Using something like table:myVar is far better than signature, yet perhaps it would be nice to have them both there.

It is clear to me that the text editor should be prepared to read the tags file and know the file type in order to jump to the correct definition.
Having such hints at the end of the line would be a great start for someone who'll write a Lua mode plugin for text editors.

@jinleileiking

Waiting for this to be completed. If the plugin can be written in Go, I would like to contribute.
