Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ClientRequest): use net.Socket interceptor #515

Merged
merged 82 commits into from
Jul 4, 2024

Conversation

kettanaito
Copy link
Member

@kettanaito kettanaito commented Mar 3, 2024

Yet another attempt at the Socket-based interceptor. This time, we are tapping into the HTTPParser from Node.js to give us the HTTP message parsing.

Roadmap

  • Implement SocketController.respondWith(). The challenge is to convert the Fetch API Response to the HTTP message and push it to the socket instance. We can use https://github.com/har-tools/http-message if nothing else works.
  • Implement error replays. Depends on how we handle the compatibility tests below.
    • I think duplicate connect and ready events are OK since those are internal and no consumers really depend on them. The main worry is the duplicate .write() callbacks since the request body would be written twice in the case of request listener (body read 1) + a bypassed request to an existing hostname (body read 2). We can always suppress the callbacks and then replay them.
  • Push the mockConnect() event until after the "request" interceptor event has been awaited.
    • Cannot do this. ClientRequest won't start writing the request body until it receives the "connect" and "ready" events from the socket.
  • Add a new test suite for compatibility
    • 1: Bypassed request to an existing host (no request listeners).
    • 2: Bypassed request to a non-existing host (no request listeners).
    • 3: Mocked request to an existing host (no response returned).
    • 4: Mocked request to an existing host (mocked response returned).
    • 5: Mocked request to a non-existing host (no response returned).
    • 6: Mocked request to a non-existing host (mocked response returned).
  • Socket compliance:
    • Respect the signal Socket constructor options (I presume for aborting the connection). Consider AbortSignal.timeout(n) as well. test(MockHttpSocket): add "signal" tests #559
    • Respect the right order of events in all compatibility scenarios.
    • Support Socket.setTimeout() that should time out the connection if it hasn't been established within the given time window. Support these for actual socket connections. test: add setTimeout tests #558
    • Support reusing the same Socket for multiple requests #594 Add compliance tests for the case when a single Socket is reused for multiple requests. We need to collocate request/responses correctly and handle their streams independently (likely bound the request/response stream to particular request/response instances instead of setting it as a property on the MockHttpSocket class).
  • Support TLS connections. Those use tls.connect(), not net.connect().
  • Add compliance test for non-HTTP transfer over Socket (e.g. SMTP, see this). We should do nothing about these, it's out of scope for this feature.
    • Irrelevant since we aren't patching net.connect() anymore. We utilize MockSocket to create a custom http.Agent and it's only applied to the http module.
  • Destroy the this.requestStream and this.responseStream at the right time. Consider that the consumer might've not consumed either of the streams by the time the *End() methods on MockHttpSocket are called (that will cause a premature close error).
  • Clean up ClientRequest/utils for it contains a lot of unused code now. chore: clean up ClientRequest utils #557
  • Inherit TLSSocket properties like .authorized and .encrypyed (see Socket: Respect "connect" and "setTimeout" behaviors #455). This is related to inheriting other properties like timeout and signal. (fix(MockHttpSocket): forward tls socket properties #556)
  • Update README.md since the interception algorithm is fundamentally different now.
  • Fix follow-redirect-http test, it's pending forever (skipped in 230c238).
  • fix(MockHttpSocket): handle response stream errors #548
  • Implement deferring of the response end event until all response event listeners are done (feat(ClientRequest): await response listeners before emitting the "end" response event #591 backport)

@kettanaito
Copy link
Member Author

We can spy on ClientRequest.prototype.write() and ClientRequest.prototype.end() to set an internal [kRequestStream] readable stream. That way, we can access the stream before the socket emits the connect event.

Since the body stream is the last thing we need to call the request listener, we can call the listener even before deciding whether to emit the connect+ready on the socket instance.

@mikicho, what do you think about this approach?

@mikicho
Copy link
Contributor

mikicho commented Mar 3, 2024

Yet another attempt at the Socket-based interceptor

The N time's a charm. :)

Since the body stream is the last thing we need to call the request listener, we can call the listener even before deciding whether to emit the connect+ready on the socket instance.

I like the direction you are going!

Would it work in the following case?

const req = http.request(...)
req.write('sync') // OK

await sleep(1000)
req.write('async') // I think it will work, but I'm not sure about it. my concern is that the ClientRequest would await for the `connect` event to continue
req.end()

@mikicho
Copy link
Contributor

mikicho commented Mar 3, 2024

It appears that the process.binding function is going to be removed.
Update: seems like we can use const { HTTPParser } = require('_http_common') (or import { HTTPParser } from 'node:_http_common') instead.

Also, a question, Would the socket-based interceptor support fetch as well? or it would still require a different interceptor?
If not, we may implement very similar apparch with setGlobalDispatcher

const self = this
const originalConnect = net.Socket.prototype.connect

net.Socket.prototype.connect = function (normalizedOptions) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be ...normalizedOptions here, if it's passed on as arguments to connect?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gioragutt, I think it should be but Node.js provides here a single normalized object of options. Let's dive into the source for net.Socket.connect to verify that's the case.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case, how is the spread in line 44 working? JS syntax wise it wouldn't make sense otherwise

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I write about this here:

/**
* @note In some cases, "Socket.prototype.connect" will receive already
* normalized arguments. The call signature of that method will differ:
* .connect(port, host, cb) // unnormalized
* .connect([options, cb, normalizedSymbol]) // normalized
* Check that and unwrap the arguments to have a consistent format.
*/
const unwrappedArgs = Array.isArray(args[0]) ? args[0] : args
const normalizedSocketConnectArgs = net._normalizeArgs(unwrappedArgs)
const createConnection = () => {
return originalConnect.apply(this, args)
}

TL;DR Node.js will normalize the args for .connect() in some cases but not the others. That's why we get different type of args if the socket connection is called as a part of http.request() and raw like new Socket().connect().

@kettanaito
Copy link
Member Author

kettanaito commented Mar 4, 2024

Request event duplication issue

With this approach, we have to mock the connection to the host every time in order for the ClientRequest (the higher-level consumer) to start streaming the buffered request body through the socket (it won't start that until it receives the connect event).

This means a few different sequences of events depending on the type of the request.

Request to an existing endpoint with a mocked response

  1. lookup (mocked).
  2. connect (mocked).
  3. ready (mocked).
  4. INTERNAL.
    1. Reading the request body.
    2. Firing "request" on the intercepted.
    3. Piping the mocked response to the socket.
  5. data.
  6. response.
  7. end.

Request to a non-existing endpoint with a mocked response

The same thing as the one above.

Request to a non-existing host without a mocked response

This is where things get a bit tricky.

  1. lookup (mocked).
  2. connect (mocked).
  3. ready (mocked).
  4. INTERNAL (the same request streaming to the interceptor).
  5. error. Emit the suppressed lookup (connection) error since there's no mock in the interceptor so the consumer must receive the error about requesting a non-existing host.

This scenario is different from the event order in the unpatched scenario. The initial lookup event would include an error, and mirror itself in the error event on the socket also.

In our case, the lookup/connect/ready are successful. This is problematic for a few reasons:

  1. Diverges from the "normal" event flow.
  2. The socket errored and when the client will try writing to it (request body), while we will be able to forward those chunks to the HTTPParser, the underlying socket will throw Error: write EBADF error, which I believe indicates an invalid write.

While no direct consumers care about the lookup/connect/ready events, the error event is rather crucial. In either case, keeping the event flow as close to the normal behavior as possible is desirable.

Ideas how to resolve this

Option 1: Shadow request

We can spy on the net.Socket, and whenever it's constructed, construct a "shadow" copy of it internally and run the entire request/response on that shadow while keeping the original socket pending.

class SocketController {
    constructor(socket) {
        this.socket = socket;
        // Create a shadow socket.
        this.shadow = new net.Socket(...socketArgs);
        
        // Always mock the connection on the shadow.
        this.mockConnect(this.shadow)
        
        // Observe the shadow, not the original socket.
        this.shadow.write = proxy
        this.shadow.push = proxy
        
        this.onShadowHasMock = (response) => {
            // If the shadow socket received a mocked response
            // from the "request" listener in the interceptor,
            // mock the original socket connection (it doesn't matter now)
            // and pipe the response to the original socket.
            this.mockConnect(this.socket)
            pipeResponse(response, this.socket)
        }
        
        this.onShadowHasNoMock = () => {
            // If the interceptor had no mocked response for this request,
            // continue with the original socket as-is.
            // The original socket has been pending up to this moment.
        }
        
    }
}

In order for us to still respect the original socket and not to implement a dummy Socket class to provision this "waiting" period, we can do something like an infinitely pending promise on the crucial methods of the original socket that we then resolve in case of onShadowHasNoMock.

this.socket.emit = applyProxyHere((...args) => {
    waitForShadow().then(() => {
        Reflect.apply(...args)
    })
})

If the shadow socket has a mock, the waitForShadow promise will keep pending and won't ever be resolved (alternatively, we can either resolve it with a flag or reject it and provide a .catch() handler). If the shadow socket finds no mocked response, it will resolve the promise, which will trigger the method on the original socket in the order that they were originally called.

This should include at least .emit and .write, but may require to add the promise plug to more methods.

Problems with this solution:

  • The shadow socket won't receive any writes until the original socket pipes them to the shadow socket. And the original socket won't start piping unless the ClientRequest receives the connect event from it. Catch 22.

@kettanaito kettanaito force-pushed the feat/yet-another-socket-interceptor branch from 54aa046 to 5d9a812 Compare March 4, 2024 12:25
@kettanaito
Copy link
Member Author

I tried solving the socket event order issue in f7fded8. I believe it works great, now I'm writing tests to confirm that.

@mikicho
Copy link
Contributor

mikicho commented Mar 5, 2024

this.mockConnect(this.shadow)

How is this propogate the connect/ready/lookup events to the ClientRequest so it will continue to write the data?

Catch 22

Problems with this solution:

Probably not a problem, but I want to make sure I understand, in this option we will emit connect/ready/lookup twice in case of non-mocked request, right?

While no direct consumers care about the lookup/connect/ready events

It seems like Axios doesn't use this, but got exposes this data to its users, I'm not sure for what tho...

@mikicho
Copy link
Contributor

mikicho commented Mar 5, 2024

I opened a feature request for official mocking API like undici has: nodejs/node#51979

@kettanaito
Copy link
Member Author

It seems like Axios doesn't use this, but got exposes this data to its users, I'm not sure for what tho...

It doesn't expose that data but rather hooks into those socket events to provide timings measurements. That's okay. If you take a look at the current state of the branch, it emits the events correctly for all scenarios we've talked about. There's no duplication.

I opened a feature request for official mocking API like undici has: nodejs/node#51979

This is great! I'd certainly love to hear Node's opinion on this. I would count on this landing any time soon so our work on the Socket interceptor is relevant for at least a few years.

Probably not a problem, but I want to make sure I understand, in this option we will emit connect/ready/lookup twice in case of non-mocked request, right?

Not anymore.

@mikicho
Copy link
Contributor

mikicho commented Mar 6, 2024

Not anymore.

This is impressive! how so?

@kettanaito
Copy link
Member Author

kettanaito commented Mar 9, 2024

Need help

I've pushed the latest state of the changes that supports both HTTP and HTTPS requests. I've moved things around a bit, mainly made the MockSocket to be agnostic of the type of the messages sent (just giving us the observability), introduced a MockHttpSocket and utilizes the mock socket and pipes its streams through the HTTP parser, and then integrates with the new _ClientRequestInterceptor that's now Agent-based (we aren't mocking the ClientRequest at all now).

I'm encountering errors when intercepting HTTP/HTTPS requests with a body. Error:

pnpm test:node test/modules/http/intercept/http.request.test.ts
 FAIL  modules/http/intercept/http.request.test.ts > intercepts a POST request
AbortError: The operation was aborted
 ❯ Readable.emit ../node:events:513:28

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
Serialized Error: { code: 'ABORT_ERR' }
Caused by: Error: Premature close
 ❯ Readable.emit ../node:events:513:28

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
Serialized Error: { code: 'ERR_STREAM_PREMATURE_CLOSE' }
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[1/1]⎯

This indicates either that the MockSocket connection has been closed (not the case, the connection is never established to begin with), or that the MockSocket is still in writable state when we try to .emit('data') when forwarding data events from the original socket (in the passthrough() method). The latter seems to be more likely since this.writable is still true. You can also observe the finish/end events on the mock socket to see that they are emitted after the passthrough() start.

This makes sense because right now we call socket.passthrough() in the interceptor on the request event, which fires as soon as the request headers are sent. There may be some time passing until the request body has been written completely. I think this indicates an issue that we mustn't start the passthrough until the socket is finished or is in the readable state.

I've tried things like

// the interceptor
socket.once('end', () => socket.passthrough())

But they haven't fixed the problem.

Isolated example

I've put up an isolated example of this in https://github.com/kettanaito/node-tls-mock. Strangely, all tests are passing there, even for requests with body. The socket implementation seems to be the same. I advise to inspect the differences first to see if I've missed something while porting it to this branch.

@mikicho
Copy link
Contributor

mikicho commented Apr 20, 2024

How do we need to handle Expect: 100-continue request header?

What Node.js does:

  1. flush headers (docs, source)
  2. Server side (should) returns 100 Continue response (docs)
  3. Node emits continue event. (docs)
Reproducible code
const { ClientRequestInterceptor } = require('@mswjs/interceptors/ClientRequest')
const http = require('http');
const interceptor = new ClientRequestInterceptor({
  name: 'my-interceptor',
})
interceptor.apply();
interceptor.on('request', ({ request }) => {
  request.respondWith(new Response())
});
const req = http.request({
  host: 'example.com',
  path: '/',
  headers: { Expect: '100-continue' },
})
req.on('response', res => {
  res.on('data', console.log)
  res.on('end', () => { })
})
req.on('continue', () => {
  console.log(1); // never get called when interceptor applied.
  req.end()
})

I wonder if we should emit a continue event before calling the interceptor.
I think I can open PR when we decide how to handle this.

// not connecting anymore, the response was mocked.
// If the socket has been destroyed, the error was mocked.
// Treat both as the result of the listener's call.
if (!socket.connecting || socket.destroyed) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in the Socket world, the socket won't be writableEnded until .end() is called, which can take time (e.g. request body is streaming).

Instead, I rely on .connecting, which we immediately set to false now in .mockConnect(), or the .destroyed, which will also immediately be set to false it you mock an error.

@kettanaito
Copy link
Member Author

Nock (unintentionally?) covers an edge case of a request that ends in the connect event.

@mikicho, we cannot cover this. connect depends on req.end() so it cannot be called in the connect event listener when the interceptor is enabled. Are there any other details around that test case in Nock (is it a regression)?

How do we need to handle Expect: 100-continue request header?

This reads to be an http.ClientRequest level of behavior. Is it not happening already? That's odd.

We should support it but we should also be careful not to prematurely support it. As in, if it shouldn't belong to the Socket level and we ignore a potential bug that prevents Node.js from calling checkContinue() as it does in a non-interceptor scenario.

Let's add an integration test for this first and then see! Would love to dive deeper to learn what exactly Node.js does step-by-step, and at which step it stops when using the interceptor.

@kettanaito
Copy link
Member Author

@mikicho, hi! You've mentioned there are also some remaining issues left here. Could you please post them in the comment to this pull request? I'd love to see what's left so I can tackle some of them whenever I have a moment. Thank you!

@mikicho
Copy link
Contributor

mikicho commented Jun 28, 2024

This reads to be an http.ClientRequest level of behavior. Is it not happening already? That's odd.

I understand the issue. The server is the one who returns this response, and Nock mimics this behavior and emits continue automatically when Expect: 100-continue is present (source).


As far as I understand, mswjs/interceptors should be a platform for building tools like Nock and MSW. So I tried to mimic the writeContinue behavior on Nock's side in the request event, but then I encountered a couple of problems:

  1. Response does not support 100 HTTP code (source)
    PS: I don't know why Undici has this limitation, and I couldn't find it in the spec.

Uncaught RangeError: init["status"] must be in the range of 200 to 599, inclusive.

  1. With the current respondWith API, how can we return two responses (one with 100 and another when we get the request body)

Having said that, this looks like an esoteric edge case, so if it requires API changes, we might want to automatically writeContinue inside mswjs/interceptors.

@kettanaito
Copy link
Member Author

Destroying request/response parsers

We are currently correctly destroying the parsers, specifically:

The request parser is destroyed once the request emits the finish event, indicating that the request can no longer be modified:

this.once('finish', () => this.requestParser.free())

The response parser is destroyed once the .destroy() method is called on the socket, indicating that the socket is no longer operable:

The .destroy() method is called for both normal request handling and in case of an exception. We are always freeing the response parser.

@kettanaito
Copy link
Member Author

I believe this feature is ready for merging! 🎉 The test suite passes, the compatibility looks solid.

There are a few pending tasks for the Nock compatibility, which we've agreed with @mikicho to tackle after this is released. I don't wish to stall this update any longer. It's a huge change and it blocks any bugfixes with having to port them here as well.

Looking forward to releasing the Socket-based interceptor sometime this week!

@kettanaito kettanaito changed the title feat: implement net.Socket interceptor feat(ClientRequest): use net.Socket interceptor Jul 4, 2024
@kettanaito kettanaito merged commit 77a6996 into main Jul 4, 2024
2 checks passed
@kettanaito kettanaito deleted the feat/yet-another-socket-interceptor branch July 4, 2024 18:45
@kettanaito
Copy link
Member Author

Released: v0.32.0 🎉

This has been released in v0.32.0!

Make sure to always update to the latest version (npm i @mswjs/interceptors@latest) to get the newest features and bug fixes.


Predictable release automation by @ossjs/release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants