The node forces internet call for tokenizer (2 work-arounds)

Hi — first, thank you for releasing VibeVoice-ComfyUI. The node works very well overall, but I discovered an issue that prevents it from running offline even when the tokenizer files are correctly downloaded and placed locally.
To reproduce, start ComfyUI without an internet connection.

I wanted to share the cause, two workarounds, and a suggested fix to help future users.

###  Problem Summary
Even if the Qwen2.5 tokenizer files are placed in `ComfyUI/models/vibevoice/tokenizer/`, the processor still attempts to reach HuggingFace:

- https://huggingface.co/Qwen/Qwen2.5-1.5B/resolve/main/tokenizer_config.json
- https://huggingface.co/Qwen/Qwen2.5-7B/resolve/main/tokenizer_config.json

This happens even though the log reports that the files are present:
```
[VibeVoice] Found Qwen tokenizer in: /home/username/ComfyUI/models/vibevoice/tokenizer
[VibeVoice] Found complete tokenizer at: /home/username/ComfyUI/models/vibevoice/tokenizer
[VibeVoice] Standard from_pretrained failed: expected str, bytes or os.PathLike object, not NoneType
[VibeVoice] Trying with allow remote files...
'(MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /Qwen/Qwen2.5-7B/resolve/main/tokenizer_config.json 
```

It can be useful to run ComfyUI and generative models on offline machines, particularly when testing unfamiliar code from others.


###  Cause of the Problem
Inside `vibevoice_processor.py`, the processor loads the tokenizer named by this field:

`"language_model_pretrained_name": "Qwen/Qwen2.5-1.5B"`

(or `"Qwen/Qwen2.5-7B"` for the 7B model), as specified in each model's `preprocessor_config.json`.

If this field points to a HuggingFace repo name instead of a local directory, I think the loader first attempts a local load (which fails due to a NoneType or mismatched path) and then falls back to:

`kwargs['local_files_only'] = False  # enable online fetching`

I think this forces a second attempt that ALWAYS hits HuggingFace, even when local files exist.
So offline mode doesn't work unless the user manually updates the config.
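
To see whether a given model is affected, it may help to check what the field resolves to on disk. The snippet below is only a diagnostic sketch: `tokenizer_name_is_local` is a hypothetical helper and not part of VibeVoice-ComfyUI, and it assumes the field name shown above.

```python
import json
import os


def tokenizer_name_is_local(config_path):
    """Return (name, is_local_dir) for the tokenizer named in a preprocessor_config.json.

    Hypothetical diagnostic helper; not part of VibeVoice-ComfyUI.
    """
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)

    name = config.get("language_model_pretrained_name")
    # A value such as "Qwen/Qwen2.5-1.5B" is a Hub repo id, not a directory,
    # so isdir() is False unless a relative folder with that name happens to exist.
    is_local = isinstance(name, str) and os.path.isdir(name)
    return name, is_local
```

If the second value is `False`, the config still names a remote repo and the fallback described above is likely to be reached.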

### ✅ Workaround editing the VibeVoice preprocessor_config.json file (Confirmed Working)

Each model's directory contains a small, easy-to-edit config file:

- `ComfyUI/models/vibevoice/VibeVoice-1.5B/preprocessor_config.json`
- `ComfyUI/models/vibevoice/VibeVoice-7B/preprocessor_config.json`

Change this line:

`"language_model_pretrained_name": "Qwen/Qwen2.5-1.5B"`

(or `"Qwen/Qwen2.5-7B"`) to the local tokenizer folder, for example in my case:

`"language_model_pretrained_name": "/home/<user>/ComfyUI/models/vibevoice/tokenizer"`

Use forward slashes in the path: a single backslash inside a JSON string starts an escape sequence.

Once set, both 1.5B and 7B models load entirely offline without any network attempts.
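
For anyone who prefers not to edit the files by hand, the same change could be scripted. This is a minimal sketch, assuming the field name above; `point_config_at_local_tokenizer` is a hypothetical helper, and it keeps a `.bak` copy of the original file.

```python
import json
import os
import shutil


def point_config_at_local_tokenizer(config_path, tokenizer_dir):
    """Rewrite language_model_pretrained_name to a local folder.

    Hypothetical helper; not part of VibeVoice-ComfyUI.
    Returns the previous value of the field.
    """
    if not os.path.isdir(tokenizer_dir):
        raise FileNotFoundError(f"Tokenizer folder not found: {tokenizer_dir}")

    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)

    previous = config.get("language_model_pretrained_name")
    shutil.copyfile(config_path, config_path + ".bak")  # keep a backup

    # json.dump escapes the path correctly, whatever separators it contains.
    config["language_model_pretrained_name"] = os.path.abspath(tokenizer_dir)
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return previous
```

It would be called once per model, passing that model's `preprocessor_config.json` and the shared tokenizer folder.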


### ✅ Workaround editing one of your node's files: vibevoice_processor.py (Confirmed Working)
`/home/<user>/ComfyUI/custom_nodes/VibeVoice-ComfyUI/vvembed/processor/vibevoice_processor.py`

My objective was not a fix for every setup, just a minimal patch that:

- completely prevents online access
- still loads the tokenizer from your local tokenizer folder
- requires the fewest code changes
- does NOT require rewriting the full logic

Inside `vibevoice_processor.py`, the online fallback is triggered by the following `except` block. Change this:

```
except Exception as e:
    logger.warning(f"Standard from_pretrained failed: {e}")
    logger.info("Trying with allow remote files...")
    kwargs['local_files_only'] = False
    tokenizer = VibeVoiceTextTokenizerFast.from_pretrained(
        language_model_pretrained_name,
        **kwargs
    )

```

to
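```
except Exception as e:
    logger.warning(f"Standard from_pretrained failed: {e}")
    logger.info("Falling back to local tokenizer directory instead of remote.")

    # Force loading from the known local tokenizer folder, which sits next to
    # the model folder (e.g. models/vibevoice/tokenizer).
    # Requires `import os` at the top of the file if it is not already there.
    # normpath strips a trailing slash so dirname returns the parent folder.
    local_tokenizer_dir = os.path.join(
        os.path.dirname(os.path.normpath(pretrained_model_name_or_path)),
        "tokenizer"
    )

    # local_files_only=True means this call never reaches the network.
    tokenizer = VibeVoiceTextTokenizerFast.from_pretrained(
        local_tokenizer_dir,
        local_files_only=True
    )
```
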
###  Suggestion
To avoid requiring every user to patch their configs manually, you could modify vibevoice_processor.py so that it:
 1. Prefers a local tokenizer directory automatically if one exists, regardless of the value in preprocessor_config.json.
 2. Does NOT force the online fallback unless explicitly requested.

Thanks again for the great work — best of the VibeVoice nodes I've tried. Would be great if this forced-online behaviour could be fixed.
