includes("**/xmake.lua") hangs #5052

Closed
TOMO-CAT opened this issue May 4, 2024 · 60 comments

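TOMO-CAT commented May 4, 2024

Xmake version

xmake v2.8.9+dev.50dbca648

Operating system version and architecture

Ubuntu 22.04

Describe the problem

Using includes("**/xmake.lua") makes xmake hang, but including the files one by one works fine.
image
At first I suspected that one of the xmake.lua files also called includes("**/xmake.lua"), causing recursive parsing, so I added every xmake.lua under the current folder one by one to check:

image

Another machine runs xmake at the same commit ID, so the xmake version is ruled out as the cause. This should be easy to track down with a debugger, but the emmylua plugin also runs inside the xmake process, so xmake hangs before the debugging stage is reached:
image

Expected results

I want to know why it hangs. I suspect it is related to my local project, but the project is too large to reduce to a minimal demo. Alternatively, are there other ways to debug this?

Project configuration

None yet.

Additional information and error logs

None.

TOMO-CAT added the bug label May 4, 2024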

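waruqi commented May 4, 2024 · Member

Probably one of the xmake.lua files matched by includes itself contains includes("**/xmake.lua"), which leads to an infinite loop. I can't reproduce it here, though, so you will have to debug it yourself.

function interpreter:api_builtin_includes(...)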

TOMO-CAT commented May 4, 2024 · Author

"api_builtin_includes"

OK, I'll add some logging there and take a look.

waruqi commented May 9, 2024 · Member

Has this been solved?

TOMO-CAT commented May 9, 2024 · Author

"Has this been solved?"

Not yet. It hangs on this line:
image

The log is as follows:
image

TOMO-CAT commented May 9, 2024 · Author

I found a way to debug without adding log statements everywhere. Let me debug with that first.
image

TOMO-CAT commented May 9, 2024 · Author

"Has this been solved?"

This is as far as I can step in the debugger:

rootdir "."
pattern ".*/xmake%.lua"
recursion -1
mode 0
excludes nil
callback nil

image

TOMO-CAT commented May 9, 2024 · Author

If the arguments are correct up to this point, I'll just reinstall xmake. If they are wrong, I'll keep debugging to find which line is at fault.

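waruqi commented May 10, 2024 · Member

Then you are probably hitting the recursive **/ match, which scans too many files, so the traversal is very slow. How many files does this project contain?

You can pass a callback to see exactly how far the traversal has got:

local file = os.match("**/xmake.lua", true, function (filepath, isdir)
    print(filepath)
    return true
end)

It could also be that your .git directory contains too many files.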

TOMO-CAT · Author

local file = os.match("**/xmake.lua", true, function (filepath, isdir)
    print(filepath)
    return true
end)

image

It still hangs immediately without printing anything. The project really is large, but find . -name xmake.lua returns very quickly.
image

waruqi commented May 10, 2024 · Member

Then you will have to go on and debug the traversal in the C code:

static tb_long_t xm_os_find_walk(tb_char_t const* path, tb_file_info_t const* info, tb_cpointer_t priv)

Pull the xmake source, then add printf or tb_trace_i calls in the function above to print what it is doing and see how far the traversal gets.

./configure
make
source ~/scripts/srcenv.profile
xmake build

If that still doesn't help, you will have to debug the traversal interface in tbox: https://github.com/tboox/tbox/blob/c6b0a56076941b8263e162c7fe7b0870ea44e09c/src/tbox/platform/posix/directory.c#L270

TOMO-CAT · Author

"Then you will have to go on and debug the traversal in the C code."

OK, I'll try again. Also, can I skip the traversal under .git with includes("**/xmake.lua|.git/**/xmake.lua")?

TOMO-CAT commented May 10, 2024 · Author

"It could also be that your .git directory contains too many files."

Can xmake cope with traversing 40,000 files?

image

waruqi commented May 10, 2024 · Member

includes("**/xmake.lua|.git/**")

waruqi commented May 10, 2024 · Member

"Can xmake cope with traversing 40,000 files?"

I have never stress-tested it, but it may well be slow. I normally would not do this in includes.

waruqi commented May 10, 2024 · Member

You can test it yourself with xmake l os.files "**".

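TOMO-CAT · Author

It has been stuck for at least a minute without returning.
image

Can the performance be improved? find, at least, returns immediately. This project is on the large side, but git projects with more than 5,000 files are quite common.

TOMO-CAT · Author

"I have never stress-tested it, but it may well be slow. I normally would not do this in includes."

But this is the usage shown in the official documentation, and it is very handy for large projects that have xmake.lua files in many directories.
image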

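waruqi commented May 10, 2024 · Member

"Can the performance be improved? find, at least, returns immediately."

find is pure C. In xmake, every entry visited during traversal has to call back through the Lua interface and run a Lua pattern match. That inevitably costs some performance and cannot be avoided. The results also have to be stored in a Lua table, so they are subject to Lua's resource management.

waruqi commented May 10, 2024 · Member

Still, it should not be that slow, only somewhat slower than find. Here, on llvm with about 120,000 files, find takes 2s. Traversal in xmake, including Lua pattern matching and the cost of printing the output, takes about 10s in total. You only have 40,000 files, so it should take 4s at most. A hang must be caused by something else, and for that you will have to debug the C-layer code yourself.

llvm ruki$ time find ./ -type f | wc -l
  126189

real	0m1.946s
user	0m0.145s
sys	0m1.629s
llvm ruki$ time xmake l os.files "**" > /tmp/a

real	0m10.200s
user	0m4.564s
sys	0m5.304s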

TOMO-CAT · Author

"A hang must be caused by something else, and for that you will have to debug the C-layer code yourself."

I also think it is hanging somewhere. I'll add logging to the C code and debug it.

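waruqi commented May 10, 2024 · Member

Also, given your current xmake.lua layout, you could switch to includes("*/xmake.lua", "*/*/xmake.lua"), which limits traversal to one and two directory levels and should be considerably more efficient. Avoid recursively traversing everything whenever you can, especially in a large project.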
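
To illustrate why fixed-depth patterns bound the work, here is a minimal sketch in Python rather than xmake's Lua. It uses only the standard pathlib module; fixed_depth_matches is a hypothetical helper and says nothing about how xmake implements includes. Each pattern names an exact number of directory levels, so the search never descends below the deepest pattern.

```python
from pathlib import Path


def fixed_depth_matches(root, patterns):
    """Return sorted root-relative paths matching any fixed-depth pattern.

    Hypothetical helper for illustration only. A pattern such as
    "*/xmake.lua" looks exactly one level below root, and
    "*/*/xmake.lua" exactly two, so deeper directories are never listed.
    """
    root = Path(root)
    found = set()
    for pattern in patterns:
        for path in root.glob(pattern):
            found.add(path.relative_to(root).as_posix())
    return sorted(found)
```

Called as fixed_depth_matches(".", ["*/xmake.lua", "*/*/xmake.lua"]), it would report xmake.lua files one and two levels down and ignore anything deeper, which mirrors the intent of the two-pattern includes call above.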

TOMO-CAT · Author

"Avoid recursively traversing everything whenever you can, especially in a large project."

I tried the directories one by one: once the blade-bin folder is deleted, it works. For now I am working around it with the line below, but I still don't understand why it hangs inside the C++ function:

includes("**/xmake.lua|blade-bin/**/xmake.lua")

waruqi commented May 26, 2024 · Member

That can only be found by debugging the C-layer code on your side.

Yangff commented May 28, 2024

"once the blade-bin folder is deleted, it works"

My guess is that this folder contains a symlink pointing to a parent directory.

image
image
That is enough to make it hang.
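
The effect can be reproduced outside xmake with a small Python sketch. Everything here is an assumption made for illustration: the directory names, the helper names build_demo_tree and naive_walk, and the idea that the walker follows directory symlinks without any cycle check. The max_depth argument exists only so the demonstration terminates; without it the recursion would keep descending through the symlink until the operating system refuses to resolve the path.

```python
import os


def build_demo_tree(root):
    """Create root/src/xmake.lua and root/blade-bin/up, a symlink back to root."""
    os.makedirs(os.path.join(root, "src"))
    open(os.path.join(root, "src", "xmake.lua"), "w").close()
    os.makedirs(os.path.join(root, "blade-bin"))
    # The symlink points at the project root, i.e. at one of its own ancestors.
    os.symlink(os.path.abspath(root),
               os.path.join(root, "blade-bin", "up"),
               target_is_directory=True)


def naive_walk(root, max_depth):
    """Visit directories recursively, following symlinks with no cycle check."""
    visited = []

    def recurse(path, depth):
        visited.append(path)
        if depth >= max_depth:  # artificial cap so the demo stops
            return
        with os.scandir(path) as it:
            entries = sorted(it, key=lambda e: e.name)
        for entry in entries:
            if entry.is_dir(follow_symlinks=True):
                recurse(entry.path, depth + 1)

    recurse(root, 0)
    return visited
```

After build_demo_tree("proj"), the list returned by naive_walk("proj", n) keeps growing as n grows, with paths such as proj/blade-bin/up/blade-bin/up/src, even though the tree contains only three real directories. Creating symlinks may require extra privileges on Windows.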

TOMO-CAT · Author

"once the blade-bin folder is deleted, it works"

Indeed, both bazel and blade generate symlinks.

TOMO-CAT reopened this Jun 4, 2024

TOMO-CAT commented Jun 4, 2024 · Author

By the way, does hanging when a symlink is present (even if there is no xmake.lua file behind the symlink) count as a bug?

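waruqi commented Jun 4, 2024 · Member

"By the way, does hanging when a symlink is present (even if there is no xmake.lua file behind the symlink) count as a bug?"

The symlink to a parent directory probably makes the recursive traversal loop forever. At present the traversal has no way to distinguish this case.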

TOMO-CAT commented Jun 4, 2024 · Author

"The symlink to a parent directory probably makes the recursive traversal loop forever. At present the traversal has no way to distinguish this case."

OK, I'll look at what bazel actually generates.

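TOMO-CAT commented Jun 4, 2024 · Author

"The symlink to a parent directory probably makes the recursive traversal loop forever. At present the traversal has no way to distinguish this case."

Still, wouldn't it be friendlier to detect the loop here and raise a warning or an exception? In my experience it simply hung, and without debugging experience I would have had no idea what was going on.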

waruqi commented Jun 4, 2024 · Member

"Still, wouldn't it be friendlier to detect the loop here and raise a warning or an exception?"

Detecting this would mean maintaining a map of the paths already traversed, which hurts traversal performance, especially over a large number of files, and also costs memory. This case is also very rare. Doing more complex processing just to detect it is not worthwhile.
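
For readers weighing that trade-off, here is one possible shape for such a check, sketched in Python as an assumption rather than as anything xmake or tbox does. Instead of a map of every visited path, it remembers only the (device, inode) pairs of the directories on the current branch, so memory grows with directory depth rather than with the total number of files. The cost is one extra stat call per directory entered. find_files is a hypothetical name.

```python
import os


def find_files(root, filename):
    """Yield paths named `filename` under root, following directory symlinks
    but skipping any directory that is already an ancestor of itself."""

    def identity(path):
        st = os.stat(path)  # follows symlinks, so a link and its target match
        return (st.st_dev, st.st_ino)

    def recurse(path, ancestors):
        with os.scandir(path) as it:
            entries = sorted(it, key=lambda e: e.name)
        for entry in entries:
            if entry.is_dir(follow_symlinks=True):
                ident = identity(entry.path)
                if ident in ancestors:
                    continue  # entering this directory would start a cycle
                yield from recurse(entry.path, ancestors | {ident})
            elif entry.name == filename:
                yield entry.path

    yield from recurse(root, frozenset({identity(root)}))
```

With a layout like the one in this thread (a blade-bin directory holding a symlink back to the project root), list(find_files(".", "xmake.lua")) terminates and lists each real xmake.lua once. This sketch relies on st_dev and st_ino being meaningful, which holds on typical POSIX filesystems.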

TOMO-CAT commented Jun 4, 2024 · Author

"Doing more complex processing just to detect it is not worthwhile."

OK, let's keep things as they are for now.

TOMO-CAT closed this as completed Jun 4, 2024