starttls issue (unable to update event disposition: No such file or directory) #247

Open
luveti opened this issue Mar 26, 2021 · 4 comments

Comments

@luveti

luveti commented Mar 26, 2021

This is an example from the documentation, with a snippet from daurnimator's lua-http library, and is the minimal code needed to reproduce the issue I'm seeing in my application.

local ce = require "cqueues.errno"
local cqueues = require "cqueues"
local socket = require "cqueues.socket"

local cq = cqueues.new()

-- copied from https://git.io/JYqpM
local function onerror(socket, op, why, lvl)
	local err = string.format("%s: %s", op, ce.strerror(why))
	if op == "starttls" then
		local ssl = socket:checktls()
		if ssl and ssl.getVerifyResult then
			local code, msg = ssl:getVerifyResult()
			if code ~= 0 then
				err = err .. ":" .. msg
			end
		end
	end
	if why == ce.ETIMEDOUT then
		if op == "fill" or op == "read" then
			socket:clearerr("r")
		elseif op == "flush" then
			socket:clearerr("w")
		end
	end
	return err, why
end

local function send_request()
	local http = socket.connect("google.com", 443)
	http:onerror(onerror)
	local ok, err, errno = http:starttls()
	if not ok then
		-- Note: calling http:close() here causes a different error to occur (Bad file descriptor)
		return nil, err, errno
	end
	http:write("GET / HTTP/1.0\n")
	http:write("Host: google.com:443\n\n")

	local status = http:read()
	print("!", status)
	for ln in http:lines "*h" do
		print("|", ln)
	end

	local empty = http:read "*L"
	print "~"

	for ln in http:lines "*L" do
		io.stdout:write(ln)
	end
	http:close()
end

cq:wrap(function()
	while true do
		print(send_request())
		cqueues.sleep(0.5)
	end
end)

cq:wrap(function()
	while true do
		print(send_request())
		cqueues.sleep(1)
	end
end)

print(cq:loop())

Note: The contents of onerror don't affect this issue, but give a nice error message.

Output:

nil     starttls: Network is unreachable        101
nil     starttls: Network is unreachable        101
nil     starttls: Network is unreachable        101
nil     starttls: Network is unreachable        101
nil     starttls: Network is unreachable        101
nil     starttls: Network is unreachable        101
nil     starttls: Network is unreachable        101
nil     starttls: Network is unreachable        101
nil     starttls: Network is unreachable        101
nil     starttls: Network is unreachable        101
nil     starttls: Network is unreachable        101
false   unable to update event disposition: No such file or directory (fd:28)   2       thread: 0xb6d75648      nil     28

The number of starttls lines varies from one to about a dozen, which leads me to believe something is being garbage collected while a reference to the collected object is still in use.

@daurnimator
Collaborator

This code you linked works for me: it repeatedly (successfully) makes requests to google.com. Is there some other ingredient I need to reproduce?

@luveti
Author

luveti commented Mar 29, 2021

Hey @daurnimator, looks like I forgot to mention that the device is sitting behind a captive portal that hasn't been "signed into" yet. I set up a Raspberry Pi 3 Model B+ as a router using RaspAP and Nodogsplash.

I should also mention that I'm running cqueues on a Raspberry Pi 4 Model B (as part of a much larger project). I'm using the latest version of cqueues.

If you need hardware (or funds for some), we would be more than willing to donate. Trying to set up a captive portal on an old router is much harder than doing so on a Raspberry Pi!

@luveti
Author

luveti commented Mar 31, 2021

As a temporary workaround, I ended up moving all my HTTP requests into threads (using a task queue I've had for a while). This worked pretty well at first, as I could just let the thread die when the error above occurred.

But after letting this run for a while, I started to get "Too many open files" errors from various other places in my program. I ran lsof -c luajit | wc -l and noticed that luajit was opening more and more files over time. I was able to reproduce this with the example above.
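
The same growth can also be watched from inside the process instead of through lsof. The following is a minimal sketch, assuming Linux (a readable /proc) and a POSIX shell behind io.popen. count_open_fds is a hypothetical helper name and not part of cqueues.

```lua
-- Hypothetical helper (not part of cqueues): count this process's open file
-- descriptors by listing /proc/<pid>/fd. Linux-only; assumes a POSIX shell.
local function count_open_fds()
	-- io.popen runs the command through sh, so $PPID expands to the pid of
	-- the calling Lua process (the shell's parent).
	local pipe = io.popen("ls /proc/$PPID/fd 2>/dev/null")
	if not pipe then
		return nil, "io.popen failed"
	end
	local n = 0
	for _ in pipe:lines() do
		n = n + 1
	end
	pipe:close()
	if n == 0 then
		return nil, "/proc/<pid>/fd is not readable"
	end
	-- The count includes the read end of the pipe opened by io.popen itself,
	-- so compare successive samples rather than trusting the absolute value.
	return n
end
```

Printing count_open_fds() once per iteration of the request loop should show whether the number keeps climbing after failed connections. Note that io.popen blocks, so this is only suitable as a debugging aid.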

@luveti
Author

luveti commented Apr 4, 2021

I've been poking around in the internals of cqueues, and I'm starting to think there may be a leak somewhere in the DNS logic under so_open. If I pass an IP address to socket.connect, the issue I've described goes away. Using dns.query to resolve the domain name first doesn't appear to cause a leak. Here is an example that works:

local ce = require('cqueues.errno')
local cqueues = require('cqueues')
local dns = require('cqueues.dns')
local packet = require('cqueues.dns.packet')
local record = require('cqueues.dns.record')
local socket = require('cqueues.socket')

local cq = cqueues.new()

local function onerror(socket, op, why, lvl) -- luacheck: ignore 212
	local err = string.format("%s: %s", op, ce.strerror(why))
	if op == "starttls" then
		local ssl = socket:checktls()
		if ssl and ssl.getVerifyResult then
			local code, msg = ssl:getVerifyResult()
			if code ~= 0 then
				err = err .. ":" .. msg
			end
		end
	end
	if why == ce.ETIMEDOUT then
		if op == "fill" or op == "read" then
			socket:clearerr("r")
		elseif op == "flush" then
			socket:clearerr("w")
		end
	end
	return err, why
end

local function domain_to_ip_address(domain)
	local p, err_code = dns.query(domain, 'A')
	if not p then return nil, err_code end
	for r in p:grep({ section = packet.section.ANSWER, type = record.type.A }) do
		return r:addr()
	end
end

local function send_request()
	local ip, err = domain_to_ip_address('google.com')
	if not ip then
		print('failed to resolve domain name', err)
		return
	end

	local http = socket.connect(ip, 443)
	http:onerror(onerror)
	local ok, err, errno = http:starttls()
	if not ok then
		return nil, err, errno
	end
	http:write("GET / HTTP/1.0\n")
	http:write("Host: google.com:443\n\n")

	local status = http:read()
	print("!", status)
	for ln in http:lines "*h" do
		print("|", ln)
	end

	local empty = http:read "*L"
	print "~"

	for ln in http:lines "*L" do
		io.stdout:write(ln)
	end
	http:close()
end

cq:wrap(function()
	while true do
		print(pcall(function()
			print(send_request())
		end))
		cqueues.sleep(1)
	end
end)

print(cq:loop())

While poking around, I noticed defining SOCKET_DEBUG outputs some useful info. It appears socket.connect attempts to connect to both IPv4 and IPv6 addresses using the same file descriptor:

fd = 6
connect(google.com./[172.217.9.78]:443): Connection refused
fd = 6
connect(google.com./[2607:f8b0:4009:816::200e]:443): Network is unreachable
nil     starttls: Network is unreachable        101
true
fd = 6
connect(google.com./[172.217.9.78]:443): Connection refused
fd = 6
connect(google.com./[2607:f8b0:4009:816::200e]:443): Network is unreachable
nil     starttls: Network is unreachable        101
true
fd = 6
connect(google.com./[172.217.9.78]:443): Connection refused
fd = 6
connect(google.com./[2607:f8b0:4009:816::200e]:443): Network is unreachable
nil     starttls: Network is unreachable        101
true
false   unable to update event disposition: No such file or directory (fd:6)    2       thread: 0xb6d15ee8      nil     6

NOTE: I've added printf("fd = %i\n", fd); to so_trace in src/lib/socket.c.

I wonder if starttls should even be called if the calls to connect fail?
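
One way to test that idea is to wait for the connection explicitly and only start the handshake if it succeeds. This is a minimal sketch, assuming the socket object exposes connect and starttls methods as cqueues sockets do. connect_then_starttls is a hypothetical helper name, and whether it actually avoids the "unable to update event disposition" error is untested.

```lua
-- Hypothetical wrapper: only start the TLS handshake once the TCP connection
-- has been established. `sock` is assumed to be a cqueues socket (or any
-- object with the same connect/starttls methods).
local function connect_then_starttls(sock, timeout)
	-- Wait for connection establishment; on failure, pass the error values
	-- through unchanged and never touch starttls.
	local ok, err, errno = sock:connect(timeout)
	if not ok then
		return nil, err, errno
	end
	ok, err, errno = sock:starttls()
	if not ok then
		return nil, err, errno
	end
	return true
end
```

In send_request this would replace the direct http:starttls() call, for example local ok, err, errno = connect_then_starttls(http). The sketch deliberately does not close the socket on failure, since closing it after a failed starttls produced a different error (Bad file descriptor) in the first example.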
