NodeJS Local Files Only - Headers Not Defined & Incorrect Path Splitters #520

Open
1 of 5 tasks
axrati opened this issue Jan 13, 2024 · 17 comments
Labels
bug Something isn't working

Comments

@axrati

axrati commented Jan 13, 2024

System Info

Windows 10 - 10.0.19045 Build 19045
Alienware m17 R3
CPU - Intel i7-10750H

Node version:
v16.14.2

main.mjs:

import { pipeline } from '@xenova/transformers';

let pipe = await pipeline('feature-extraction','gte-small',{local_files_only:true});
let out = await pipe('Hey model! Respond to me!');

Package.json:

{
  "name": "js-hf",
  "version": "1.0.0",
  "description": "",
  "main": "main.mjs",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "@xenova/transformers": "^2.14.0"
  }
}

Clone this repository directly into root of project:
https://huggingface.co/Supabase/gte-small

Your final project outlook will look like this:

--${YOUR_PROJ_NAME}
----- gte-small
----- node_modules
----- main.mjs
----- package-lock.json
----- package.json

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

When loading a model from local files, the library still appears to run HTTP-related code. With local_files_only set to true, I would expect it to use local files only.

Secondly, the paths used to load assets look incorrect on a Windows computer: / is used instead of \ for transformer assets. Relative vs. absolute paths don't seem to be respected either... perhaps that needs to be changed as well?

Error output:

Axrati@DESKTOP-H8KG7FT MINGW64 ~/Desktop/Code/js-hf
$ node main.mjs
Unable to load from local path "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/tokenizer.json": "ReferenceError: Headers is not defined"
Unable to load from local path "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/tokenizer_config.json": "ReferenceError: Headers is not defined"
Unable to load from local path "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/config.json": "ReferenceError: Headers is not defined"
file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:462
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/tokenizer.j
    at getModelFile (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:462:27)
    at async getModelJSON (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:575:18)
    at async Promise.all (index 0)
    at async loadTokenizer (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/tokenizers.js:61:18)
    at async Function.from_pretrained (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/tokenizers.js:4296:50)
    at async Promise.all (index 0)
    at async loadItems (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/pipelines.js:3115:5)
    at async pipeline (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/pipelines.js:3055:21)
    at async file:///C:/Users/Axrati/Desktop/Code/js-hf/main.mjs:5:12

As you can see, the "Unable to load from local path" message references node_modules\@xenova\transformers\models\/gte-small/tokenizer.json, which mixes \ and / and doesn't look like a valid Windows path... It also doesn't respect the relative path (as the System Info section shows, the model is a directory in the root of the project, but this searches inside the library in node_modules).
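
For what it's worth, the mixed separators may be mostly cosmetic: Node's `path` module treats both `/` and `\` as separators under Windows rules. A minimal sketch (CommonJS, using `path.win32` so it behaves the same on any OS) of how a path shaped like the logged one normalizes; the shortened `C:\js-hf` prefix is made up for illustration:

```javascript
const path = require('node:path');

// Path shaped like the one in the error output (prefix shortened for illustration).
const logged = 'C:\\js-hf\\node_modules\\@xenova\\transformers\\models\\/gte-small/tokenizer.json';

// path.win32 applies Windows rules on any OS: "/" and "\" both count as
// separators, and repeated separators collapse into one.
const normalized = path.win32.normalize(logged);

console.log(normalized);
```

If that holds, the separators alone probably don't explain the failure; the directory being searched (inside node_modules rather than the project root) looks like the more likely culprit.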

If you look at the code in https://github.com/xenova/transformers.js/blob/main/src/utils/hub.js, you can see on lines 55-56 that the constructor for FileResponse instantiates Headers. This leads me to believe that even when the first two criteria in getFile are met (env.useFS && !isValidHttpUrl(urlOrPath)), it's still executing code that is unnecessary for the protocol it's trying to use.

I am happy to help create a PR for this! Please reach out and let me know. Would be helpful to catch up with someone on the team for repo direction/etc.

Reproduction

Based on the steps in System Info / Description:

npm install
node main.mjs

@axrati axrati added the bug Something isn't working label Jan 13, 2024
@xenova
Owner

xenova commented Jan 13, 2024

Node version: v16.14.2

Hi there 👋 Transformers.js requires Node.js v18+ to function correctly. Since Node 16 has reached EOL, we will not be adding support for it in future. See here for more information.
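
The `ReferenceError: Headers is not defined` in the original report fits this: `Headers` belongs to the Fetch API, which Node.js only exposes as a global from v18. A small guard along these lines makes the cause obvious before any model is loaded. This is only a sketch, and `checkRuntime` is a hypothetical helper, not part of Transformers.js:

```javascript
// Report whether the current runtime exposes the Fetch API globals.
function checkRuntime() {
  const major = Number(process.versions.node.split('.')[0]);
  return {
    major,
    hasHeaders: typeof globalThis.Headers === 'function',
    hasFetch: typeof globalThis.fetch === 'function',
  };
}

const runtime = checkRuntime();
if (!runtime.hasHeaders) {
  // Expected on Node 16, where the Fetch API globals do not exist.
  console.warn(`Node ${process.versions.node} has no global Headers; upgrade to v18 or newer.`);
}
```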

@axrati
Author

axrati commented Jan 13, 2024

Hello! - Thanks for the quick response! :)

I upgraded to Node v20.11.0 and reinstalled node_modules, but I'm still seeing the path error. When I pass an absolute path, it still seems to be appended to the library's own models directory. Happy to help on this if you'd like!

$ node main.mjs
file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:462
                    throw Error(`\`local_files_only=true\` or \`env.allowRemoteModels=false\` and file was not found locally at "${localPath}".`);
                          ^

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "C:\Users\Axrati\Desktop\Code\js-hf\node_modules\@xenova\transformers\models\/gte-small/tokenizer.json".
    at getModelFile (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:462:27)
    at async getModelJSON (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/utils/hub.js:575:18)
    at async Promise.all (index 0)
    at async loadTokenizer (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/tokenizers.js:61:18)
    at async AutoTokenizer.from_pretrained (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/tokenizers.js:4296:50)
    at async Promise.all (index 0)
    at async loadItems (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/pipelines.js:3115:5)
    at async pipeline (file:///C:/Users/Axrati/Desktop/Code/js-hf/node_modules/@xenova/transformers/src/pipelines.js:3055:21)
    at async file:///C:/Users/Axrati/Desktop/Code/js-hf/main.mjs:4:12

Node.js v20.11.0

@axrati
Author

axrati commented Jan 13, 2024

I was able to fix this with the following code:

import { pipeline, env } from '@xenova/transformers';
env.localModelPath = './';
let pipe = await pipeline('feature-extraction','gte-small',{local_files_only:true});

I think changing the semantics of this may benefit more diverse projects. I am building an Electron app that lets users point to and use models wherever they are on their computer.

If I set env.localModelPath, every arbitrary model I reference on their computer needs to be relative to the Electron app's path (or whatever I set there). I would much rather be able to provide both relative and absolute paths. Perhaps a config option (relative vs. absolute, with a default) alongside local_files_only?

Happy to make the changes myself, please let me know your thoughts!
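
Until something like that exists, one possible stopgap is to split each model directory into a parent folder and a folder name, then feed those to the workaround above. This is a rough sketch under that assumption: `splitModelPath` is a hypothetical helper, and the `D:\models` and `C:\app` paths are invented for illustration.

```javascript
const path = require('node:path');

// Hypothetical helper: turn any model directory (relative or absolute) into
// the two values the workaround needs. `pathImpl` is injectable so the same
// logic can be exercised with path.posix or path.win32.
function splitModelPath(modelDir, baseDir = process.cwd(), pathImpl = path) {
  const absolute = pathImpl.resolve(baseDir, modelDir);
  const parent = pathImpl.dirname(absolute);
  return {
    // Directory to search, with a trailing separator like the './' above.
    localModelPath: parent.endsWith(pathImpl.sep) ? parent : parent + pathImpl.sep,
    // Folder name passed to pipeline() as the model id.
    modelId: pathImpl.basename(absolute),
  };
}

const example = splitModelPath('D:\\models\\gte-small', 'C:\\app', path.win32);
console.log(example);

// Assumed usage, following the workaround above:
//   env.localModelPath = example.localModelPath;
//   await pipeline('feature-extraction', example.modelId, { local_files_only: true });
```

Since env.localModelPath is a single global setting, this would have to be reassigned before each load, which is part of why a per-call option seems cleaner.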

@axrati
Author

axrati commented Jan 20, 2024

@xenova - bump! Let me know and I will start a PR for this

@hiepxanh

I have the same issue too. I don't know why, but my model is there and it cannot find that path:
env.localModelPath = new URL('../../../../../', import.meta.url).pathname;

This is not working either:

env.localModelPath = '../../../../../';

Error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/C:/Users/hiepx/small-cosmos/colbert-ir/colbertv2.0/tokenizer.json".
    at getModelFile (file:///C:/Users/hiepx/small-cosmos/losa/node_modules/.pnpm/@[email protected]/node_modules/@xenova/transformers/src/utils/hub.js:462:27)

but it's actually there. @axrati I think you should create a PR for this
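
The leading `/` in "/C:/Users/..." matches what `URL.pathname` returns for a Windows file URL. If that is what is happening here (an assumption, nothing in the thread confirms it), `fileURLToPath` from `node:url` is the usual way to get a real filesystem path instead. A minimal sketch, with a hard-coded hypothetical URL standing in for `import.meta.url`:

```javascript
const { fileURLToPath } = require('node:url');

// Hypothetical module location standing in for import.meta.url.
const moduleUrl = 'file:///C:/Users/hiepx/small-cosmos/losa/main.mjs';
const modelsUrl = new URL('../', moduleUrl);

// URL.pathname keeps the leading "/" before the drive letter.
console.log(modelsUrl.pathname); // "/C:/Users/hiepx/small-cosmos/"

// fileURLToPath converts a file: URL to a path for the current platform;
// on Windows the result starts with the drive letter, with no leading slash.
const modelsDir = fileURLToPath(modelsUrl);
console.log(modelsDir);
```

In an ES module the same idea would read `env.localModelPath = fileURLToPath(new URL('../', import.meta.url));`, with the relative part adjusted to the actual layout.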

@axrati
Author

axrati commented Jan 24, 2024

@hiepxanh Can you provide sample code and your directory structure so I can use as test case as well as mine?

@axrati
Author

axrati commented Jan 25, 2024

@xenova @hiepxanh - I have found a quick patch. unable to push a branch up though?

$ git push origin explicit-path
remote: Permission to xenova/transformers.js.git denied to axrati.
fatal: unable to access 'https://github.com/xenova/transformers.js.git/': The requested URL returned error: 403

@axrati
Author

axrati commented Jan 26, 2024

@xenova - the change here isn't major, and doesn't add full vs. relative path support. It's an issue with how localPath & requestURL are derived. Please let me open a branch to submit a PR!

@axrati
Author

axrati commented Jan 28, 2024

@xenova Bump!

@xenova
Owner

xenova commented Jan 29, 2024

@xenova @hiepxanh - I have found a quick patch. unable to push a branch up though?

$ git push origin explicit-path
remote: Permission to xenova/transformers.js.git denied to axrati.
fatal: unable to access 'https://github.com/xenova/transformers.js.git/': The requested URL returned error: 403

Hi there 👋 Feel free to fork the repository, then submit a pull request. In that way, I can review your changes.

@TrumanDu

Same issue for me!

@axrati
Author

axrati commented Feb 22, 2024

@xenova , @hiepxanh, @lsb , @TrumanDu

Sorry for the delay on this, have been working on other projects. Opened a forked PR here for review!: #602

@TrumanDu

TrumanDu commented Mar 6, 2024

@axrati thanks for your contribution, I really need your PR.

@axrati
Author

axrati commented Mar 13, 2024

@TrumanDu @hiepxanh @lsb @xenova

No problem TrumanDu :) ... xenova - can you please check the PR? Small but effective change! #602

@axrati
Author

axrati commented Mar 26, 2024

@xenova - reminder for this PR! if you can approve the checks to run it'd help ~

@wujohns

wujohns commented Apr 17, 2024

same issue, any update now?

@axrati
Author

axrati commented Apr 17, 2024

@xenova @wujohns

I have a PR open to address this! Need it to be checked:

#602


5 participants