Vercel Proxying: add config to test proxying #1957

Closed
wants to merge 7 commits into from

Conversation

danielbeardsley
Member

@danielbeardsley danielbeardsley commented Sep 4, 2023

We'd like to try experimenting with pointing www.ifixit.com at vercel and have vercel proxy any missed routes to our main php app.

Before we make changes at the CDN level, let's set up this rewrite to do some experimentation.

Note: this proxies to a domain that is configured here: https://github.com/iFixit/ifixit/pull/49636

Connect #1884
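
For readers unfamiliar with the mechanism, a fallback rewrite in Next.js might look roughly like the sketch below. The hostname `php-origin.example.com` is a placeholder (the real proxy domain is configured in the linked pull and is not shown here), and the catch-all `/:path*` pattern is an assumption about what "missed routes" covers. The config actually added in this pull may differ.

```typescript
// Sketch of a Next.js `rewrites()` config that proxies unmatched routes.
// ASSUMPTION: ORIGIN is a placeholder, not the real proxy domain.
const ORIGIN = "https://php-origin.example.com";

type Rewrite = { source: string; destination: string };

// Next.js checks `fallback` rewrites only after pages, public files and
// dynamic routes have all failed to match, i.e. for "missed" routes.
const fallbackRewrites: Rewrite[] = [
  { source: "/:path*", destination: `${ORIGIN}/:path*` },
];

async function rewrites(): Promise<{ fallback: Rewrite[] }> {
  return { fallback: fallbackRewrites };
}

// In next.config.js this would be wired up as: module.exports = { rewrites }
```

Because the destination is an external URL, Next.js (and Vercel) proxies the request rather than redirecting the browser.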


We'd like to test the performance (and features) of proxying through
our proxylb. So, let's add a few more fallback routes.

I'm also not sure if query strings get forwarded in this proxying setup,
so this also allows for testing of that.

/health-check is a fast-responding endpoint in apache that always
returns a 200 from the app machines.
@danielbeardsley
Member Author

deploy_block 👍

We don't exactly need to merge this in order to test it, so let's hold off.

@danielbeardsley
Member Author

@davidrans and I did a bunch of experimenting and benchmarking. We used the hey tool (apache bench is old and doesn't speak http2).

Here's what we learned from the places we tested:

  • The rewrite config in this pull is executed at the edge
  • Query params and cookies are passed through to the origin (Yay!!!)
  • Bunches of numbers:
    • vercel means vercel's CDN
    • php-app means the app machines
    • proxylbs means whatever LB constellix thinks is closest
From us-east-1 (ifixit lb):
cloudfront -> proxylbs -> php-app: 9ms
proxylbs -> php-app: 2.5ms
vercel -> proxylbs -> php app: 31ms
vercel -> us-east lb -> php app: 30ms

From us-west-1:
cloudfront -> proxylbs -> php-app: 155ms
proxylbs -> php-app: 126ms
vercel -> proxylbs -> php app: 157ms
vercel -> us-east lb -> php app: 215ms

From Danny's computer:
cloudfront -> proxylbs -> php-app: 292ms
vercel -> proxylbs -> php app: 428ms (repeatable and surprising that it's 130ms slower than through CF)
vercel -> us-east lb -> php app: 467ms

From David's computer:
cloudfront -> proxylbs -> php-app: 135ms
vercel -> proxylbs -> php app: 267ms (repeatable and surprising)

From eu-west-2:
proxylbs -> php-app: 80ms
cloudfront -> proxylbs -> php-app: 85ms
vercel -> proxylbs -> php app: 93ms
vercel -> us-east lb -> php app: 236ms (repeatable and surprising, indicating vercel isn't keeping an ssl connection alive to our LB)

Note: all these tests were done with /health-check which responds with an empty 204

The short of it is that vercel -> proxylbs -> php-app looks like a reasonable option and was very easy to configure.

If it's not too hard, I'd like to test proxying from within the main next.js process. Thinking that vercel has optimized the Edge -> Washington DC route and hopefully keeps connections alive.
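
To make the comparison easier to eyeball, the measurements above can be reduced to a single "extra latency from going through vercel" number per location. The sketch below only re-uses figures already listed (the `cloudfront -> proxylbs -> php-app` and `vercel -> proxylbs -> php app` rows). The variable and function names are made up for illustration.

```typescript
// Latencies in ms, copied from the measurements above.
type Sample = { cloudfront: number; vercel: number };

const measurements: Record<string, Sample> = {
  "us-east-1": { cloudfront: 9, vercel: 31 },
  "us-west-1": { cloudfront: 155, vercel: 157 },
  "Danny's computer": { cloudfront: 292, vercel: 428 },
  "David's computer": { cloudfront: 135, vercel: 267 },
  "eu-west-2": { cloudfront: 85, vercel: 93 },
};

// Extra latency (ms) of the vercel path relative to the cloudfront path.
function vercelOverhead(samples: Record<string, Sample>): Record<string, number> {
  const result: Record<string, number> = {};
  for (const [location, s] of Object.entries(samples)) {
    result[location] = s.vercel - s.cloudfront;
  }
  return result;
}

const overhead = vercelOverhead(measurements);
console.log(overhead);
// us-east-1: 22, us-west-1: 2, Danny's computer: 136,
// David's computer: 132, eu-west-2: 8
```

The cloud regions show single- or low-double-digit overhead, while both home connections show roughly 130ms, which matches the "repeatable and surprising" notes above.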

@dhmacs
Contributor

dhmacs commented Sep 5, 2023

If it's not too hard, I'd like to test proxying from within the main next.js process. Thinking that vercel has optimized the Edge -> Washington DC route and hopefully keeps connections alive.

Do you mean by adding a catch all route?

@danielbeardsley
Member Author

Do you mean by adding a catch all route?

I'm open to any way that can test it out. The little research I did on using a middleware to do it indicated the middleware would also run on the edge.

If you've got a solution in mind, feel free to push a commit here. I'd love to experiment with performance on that option.


Note: unfortunately, you can't use: return NextResponse.rewrite(...)
here, otherwise you get an error like:

    [Next] - error Error: API route returned a Response object in the Node.js runtime, this is not supported. Please use
    `runtime: "edge"` instead: https://nextjs.org/docs/api-routes/edge-api-routes

Which is precisely what we *don't* want.
@sterlinghirsh
Copy link
Member

CR ⚡ dev_block ⚡ on a potential version mismatch but otherwise this looks like a reasonable test.

Yes, we're using a newer lib version than the types we have... but I'd
rather not downgrade to an older version and we use so few features.
@danielbeardsley
Member Author

danielbeardsley commented Sep 6, 2023

Updated Numbers (with new proxy option)

From us-east-1 (ifixit lb):
cloudfront -> proxylbs -> php-app: 9ms
proxylbs -> php-app: 2.5ms
vercel -> next.js (in us-east) -> php app: 52ms
vercel -> proxylbs -> php app: 31ms
vercel -> us-east lb -> php app: 30ms

From us-west-1:
cloudfront -> proxylbs -> php-app: 155ms
proxylbs -> php-app: 126ms
vercel -> next.js (in us-east) -> php app: 111ms
vercel -> proxylbs -> php app: 170ms
vercel -> us-east lb -> php app: 250ms

From Danny's computer:
cloudfront -> proxylbs -> php-app: 292ms
vercel -> next.js (in us-east) -> php app: 350ms
vercel -> proxylbs -> php app: 428ms
vercel -> us-east lb -> php app: 467ms

From David's computer:
cloudfront -> proxylbs -> php-app: 135ms
vercel -> proxylbs -> php app: 267ms (repeatable and surprising)
vercel -> next.js (in us-east) -> php app: 215ms

From eu-west-2:
proxylbs -> php-app: 80ms
cloudfront -> proxylbs -> php-app: 85ms
vercel -> next.js (in us-east) -> php app: 127ms
vercel -> proxylbs -> php app: 100-150ms
vercel -> us-east lb -> php app: 263ms

@danielbeardsley
Member Author

To be clear, we don't need to merge this to test it. We've been testing the react-commerce-prod preview deployment.

This prevents next.js from warning about unhandled requests.

> externalResolver is an explicit flag that tells the server that this
> route is being handled by an external resolver like express or connect.
> Enabling this option disables warnings for unresolved requests.
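
As a rough illustration of a Node-runtime proxy route, here is a sketch that uses only Node's built-in `https` module instead of the `http-proxy` library mentioned in this thread. The origin hostname and the `upstreamOptions` helper are hypothetical, the origin is assumed to speak HTTPS, and the route code in this pull is not shown here, so this is not a copy of it. The `externalResolver` and `bodyParser` flags are real Next.js API route options.

```typescript
import type { IncomingMessage, ServerResponse } from "node:http";
import * as https from "node:https";

// ASSUMPTION: placeholder origin, not the real proxy domain.
const ORIGIN = new URL("https://php-origin.example.com");

// In a pages/api route this object would be exported as `config`.
// externalResolver silences the "unresolved request" warning;
// bodyParser: false leaves the request body as a raw stream to pipe.
const config = { api: { externalResolver: true, bodyParser: false } };

// Build the upstream request: same method, path and query string,
// same headers (cookies included), with Host swapped for the origin.
function upstreamOptions(
  req: Pick<IncomingMessage, "method" | "url" | "headers">,
  origin: URL
): https.RequestOptions {
  return {
    hostname: origin.hostname,
    port: origin.port || undefined,
    method: req.method,
    path: req.url,
    headers: { ...req.headers, host: origin.host },
  };
}

// In a pages/api route this function would be the default export.
function handler(req: IncomingMessage, res: ServerResponse): void {
  const upstream = https.request(upstreamOptions(req, ORIGIN), (originRes) => {
    res.writeHead(originRes.statusCode ?? 502, originRes.headers);
    originRes.pipe(res); // stream the origin's body back to the client
  });
  upstream.on("error", () => {
    if (!res.headersSent) res.statusCode = 502;
    res.end();
  });
  req.pipe(upstream); // stream the client's body (if any) to the origin
}
```

One caveat with this shape: inside a catch-all API route, `req.url` is the route's own path, so a real implementation would likely need to map it back to the originally requested path.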
@danielbeardsley
Member Author

When trying to use http-proxy from the edge:
image

Though next.js does proxying from the edge, so it's not clear what tools they use.... I'll try creating a proxy using just fetch()

@danielbeardsley
Member Author

I was able to cobble together a proxy using fetch() that can run on the edge. All four proxying options are now being tested once / minute with Alertra:
image

Tomorrow we can evaluate their averages using a spreadsheet.
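
For context, a `fetch()`-based proxy that can run on the edge could be sketched as below. It relies only on the web-standard `Request`, `Response`, `Headers` and `fetch` globals. The origin URL and the function names are placeholders, and this is a guess at the general shape, not the code that was actually deployed.

```typescript
// ASSUMPTION: placeholder origin, not the real proxy domain.
const ORIGIN_BASE = "https://php-origin.example.com";

// Map the incoming URL onto the origin, keeping path and query string.
function toOriginUrl(incoming: string, originBase: string): string {
  const src = new URL(incoming);
  const dest = new URL(originBase);
  dest.pathname = src.pathname;
  dest.search = src.search;
  return dest.toString();
}

// Forward the request to the origin and hand its response straight back.
async function proxy(request: Request): Promise<Response> {
  const headers = new Headers(request.headers); // cookies etc. pass through
  headers.delete("host"); // let fetch derive Host from the origin URL
  const hasBody = request.method !== "GET" && request.method !== "HEAD";
  return fetch(toOriginUrl(request.url, ORIGIN_BASE), {
    method: request.method,
    headers,
    // Buffers the whole body in memory; fine for a test, not for big uploads.
    body: hasBody ? await request.arrayBuffer() : undefined,
    redirect: "manual", // do not follow redirects on the client's behalf
  });
}

console.log(toOriginUrl("https://www.example.com/health-check?a=1&b=2", ORIGIN_BASE));
// → https://php-origin.example.com/health-check?a=1&b=2
```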

@danielbeardsley
Member Author

I've got a spreadsheet going but I forgot one more proxy to monitor (the current CF -> proxylbs one). I've added it and am awaiting the numbers.

@danielbeardsley
Member Author

danielbeardsley commented Sep 8, 2023

These are the numbers thus far, all times are in ms

Performance

| Alertra name | Proxy route | North America (ms) | Europe (ms) | Asia / Pacific (ms) |
| --- | --- | --- | --- | --- |
| v-prxy | vercel rewrite -> proxylbs | 200 | 263 | 654 |
| v-east | vercel rewrite -> us-east-lb | 236 | 414 | 905 |
| vn-edge | next api on edge -> us-east-lb | 340 | 485 | 1170 |
| vn-east | vercel -> next api (us-east) -> us-east-lb | 222 | 269 | 487 |
| cf-proxy | Cloudfront -> proxylbs | 210 | 248 | 696 |

Costs

I've looked at some graphs and it seems we use about 250ksec of request time on the php app per day. This amounts to approximately: (250000sec / 3600sec) * (200 MB RAM) * 30 days = 416 GB Hours per month which will cost about $40 * 5 = $200 per month in execution costs (for the next.js proxy (vn-east) option).

We also use about ~80GB per day of bandwidth which would cost:
(80 GB/day) * (30 Days) * ($40 / 100GB) = $960 per month... Wow. That's more than I thought.

Unless we sign up for the enterprise plan... which could be more or less expensive.
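
The arithmetic above can be written out as a small sketch. All inputs come from this comment. The one assumption is how the "$40 * 5" was reached: here it is read as $40 per block of 100 GB-hours, rounded up to whole blocks.

```typescript
const SECONDS_PER_HOUR = 3600;
const DAYS_PER_MONTH = 30;

// Execution: ~250k seconds of request time per day at 200 MB of RAM.
const requestSecondsPerDay = 250000;
const memoryGb = 0.2;
const gbHoursPerMonth =
  (requestSecondsPerDay / SECONDS_PER_HOUR) * memoryGb * DAYS_PER_MONTH;

// ASSUMPTION: billed at $40 per started block of 100 GB-hours.
const executionCost = Math.ceil(gbHoursPerMonth / 100) * 40;

// Bandwidth: ~80 GB per day at $40 per 100 GB.
const bandwidthGbPerDay = 80;
const bandwidthCost = (bandwidthGbPerDay * DAYS_PER_MONTH * 40) / 100;

console.log(gbHoursPerMonth.toFixed(1)); // "416.7" GB-hours per month
console.log(executionCost); // 200 dollars per month
console.log(bandwidthCost); // 960 dollars per month
```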

@dhmacs
Contributor

dhmacs commented Sep 11, 2023

Thanks for doing this work, very interesting insights @danielbeardsley !

I've looked at some graphs and it seems we use about 250ksec of request time on the php app per day. This amounts to approximately: (250000sec / 3600sec) * (200 MB RAM) * 30 days = 416 GB Hours per month which will cost about $40 * 5 = $200 per month in execution costs (for the next.js proxy (vn-east) option).

We also use about ~80GB per day of bandwidth which would cost:
(80 GB/day) * (30 Days) * ($40 / 100GB) = $960 per month... Wow. That's more than I thought.

Maybe by leveraging the CDN cache a bit more there's room for reducing costs, but I definitely agree that bandwidth costs are not cheap. 😬

If there's appetite for saving on infra costs, we can also skip the intermediary and go directly to the source: https://pages.cloudflare.com/

@danielbeardsley
Member Author

The costs would actually be somewhat lower in practice. Here's an annotated estimation:

Current Cloudfront Bytes per day (includes proxied vercel traffic): 84GB
Current Vercel bytes per day: 15GB

Total Bytes per day that would be moved from CF to vercel: 69GB

We are only using ~500GB/month of our current 1TB / month vercel bandwidth

Current Cloudfront cost for the bytes in question is $126
Additional cost for Vercel bytes is $40 / 100GB

((84GB - 15GB) * 30 days - 500GB) / (100GB / $40) - $126 = $502/month

So, an additional $500 per month. Probably worth talking to Vercel and asking about enterprise deals. Buuuut, we don't want to get hoodwinked into paying a higher price per GB than the current on-demand rates.
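
The same estimate as a parameterised sketch, which makes it easy to see how sensitive the result is to the unused Vercel allowance. The function and field names are invented for illustration; the figures are the ones quoted above.

```typescript
type BandwidthInputs = {
  cloudfrontGbPerDay: number; // current Cloudfront bytes per day
  vercelGbPerDay: number; // already served by vercel, so not "moved"
  unusedVercelGbPerMonth: number; // headroom left in the current plan
  vercelDollarsPer100Gb: number;
  cloudfrontSavingsPerMonth: number; // what those bytes cost on CF today
};

// Net additional monthly cost of moving the traffic from CF to vercel.
function additionalMonthlyCost(i: BandwidthInputs): number {
  const movedGb = (i.cloudfrontGbPerDay - i.vercelGbPerDay) * 30;
  const billableGb = Math.max(0, movedGb - i.unusedVercelGbPerMonth);
  const vercelCost = (billableGb * i.vercelDollarsPer100Gb) / 100;
  return vercelCost - i.cloudfrontSavingsPerMonth;
}

const estimate = additionalMonthlyCost({
  cloudfrontGbPerDay: 84,
  vercelGbPerDay: 15,
  unusedVercelGbPerMonth: 500,
  vercelDollarsPer100Gb: 40,
  cloudfrontSavingsPerMonth: 126,
});
console.log(estimate); // 502
```

With no unused allowance (`unusedVercelGbPerMonth: 0`) the same inputs give 702, so the headroom in the current plan is worth about $200 per month in this estimate.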

@danielbeardsley
Member Author

danielbeardsley commented Sep 20, 2023

Here is an updated and easier to interpret performance chart. I've also added a line (proxylbs) to represent not using cloudfront at all and just depending on our proxy lbs.

Difference from current (Cloudfront -> Proxylbs) performance:

| Proxy route | North America | Europe | Asia / Pacific | Sum |
| --- | --- | --- | --- | --- |
| vercel rewrite -> proxylbs | -13 ms | 54 ms | -62 ms | -21 ms |
| vercel rewrite -> us-east-lb | 25 ms | 201 ms | 202 ms | 428 ms |
| next api on edge -> us-east-lb | 147 ms | 288 ms | 247 ms | 682 ms |
| vercel -> next api (us-east) -> us-east-lb | -18 ms | 47 ms | -224 ms | -195 ms |
| proxylbs | -72 ms | -34 ms | -61 ms | -167 ms |
| Cloudfront -> proxylbs (current) | 0 ms | 0 ms | 0 ms | 0 ms |
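
The "Sum" column is simply the three regional differences added together, so the ranking can be reproduced with a short sketch. The numbers are copied from the chart above; the type and variable names are illustrative.

```typescript
// Difference from current (Cloudfront -> proxylbs) performance, in ms.
type Row = { route: string; na: number; eu: number; apac: number };

const rows: Row[] = [
  { route: "vercel rewrite -> proxylbs", na: -13, eu: 54, apac: -62 },
  { route: "vercel rewrite -> us-east-lb", na: 25, eu: 201, apac: 202 },
  { route: "next api on edge -> us-east-lb", na: 147, eu: 288, apac: 247 },
  { route: "vercel -> next api (us-east) -> us-east-lb", na: -18, eu: 47, apac: -224 },
  { route: "proxylbs", na: -72, eu: -34, apac: -61 },
  { route: "Cloudfront -> proxylbs (current)", na: 0, eu: 0, apac: 0 },
];

// Negative sums mean faster than the current setup overall.
const ranked = rows
  .map((r) => ({ route: r.route, sum: r.na + r.eu + r.apac }))
  .sort((a, b) => a.sum - b.sum);

for (const r of ranked) console.log(`${r.sum} ms  ${r.route}`);
// First line printed: -195 ms  vercel -> next api (us-east) -> us-east-lb
```

An unweighted sum treats all three regions as equally important. Weighting by traffic share per region would be a reasonable refinement, but no such figures are given here.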

@danielbeardsley
Member Author

The research is done here, we've learned the lessons; I'm gonna freeze this.

@danielbeardsley danielbeardsley added the "Cryogenic Storage" label (hides an open pull from Pulldasher) Sep 26, 2023
@danielbeardsley
Member Author

This can be re-opened (if desired) in this code's new home
