The node forces internet call for tokenizer (2 work-arounds) #215

@808charlie

Hi, and first of all thank you for releasing VibeVoice-ComfyUI. The node works very well overall, but I found an issue that prevents it from running offline even when the tokenizer files have been downloaded and placed locally.
To reproduce, start ComfyUI without an internet connection.

I want to share the cause, two workarounds, and a suggested fix to help future users.

Problem Summary

Even if the Qwen2.5 tokenizer files are placed in:
ComfyUI/models/vibevoice/tokenizer/

the processor still attempts to reach HuggingFace:
https://huggingface.co/Qwen/Qwen2.5-1.5B/resolve/main/tokenizer_config.json
https://huggingface.co/Qwen/Qwen2.5-7B/resolve/main/tokenizer_config.json

This happens even though the log reports that the files are present. Output:

[VibeVoice] Found Qwen tokenizer in: /home/username/ComfyUI/models/vibevoice/tokenizer
[VibeVoice] Found complete tokenizer at: /home/username/ComfyUI/models/vibevoice/tokenizer
[VibeVoice] Standard from_pretrained failed: expected str, bytes or os.PathLike object, not NoneType
[VibeVoice] Trying with allow remote files...
'(MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /Qwen/Qwen2.5-7B/resolve/main/tokenizer_config.json 

Running ComfyUI and generative models on offline machines is useful, particularly when testing unfamiliar code from others.
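For reproducing this without physically disconnecting the machine, one option is to block outgoing connections inside the Python process. The following is a minimal sketch and not part of VibeVoice-ComfyUI: `no_network` is a hypothetical helper name, and it only intercepts `socket.socket.connect`, so it is a debugging aid rather than a real sandbox.

```python
import socket
from contextlib import contextmanager


@contextmanager
def no_network():
    """Hypothetical helper: make outgoing socket connections fail inside the block."""

    def blocked_connect(self, address):
        raise OSError(f"network disabled, attempted connection to {address!r}")

    # socket.socket is a Python-level class, so its connect method can be shadowed.
    had_own_attr = "connect" in socket.socket.__dict__
    original = socket.socket.__dict__.get("connect")
    socket.socket.connect = blocked_connect
    try:
        yield
    finally:
        if had_own_attr:
            socket.socket.connect = original
        else:
            # Remove the shadow so the inherited implementation is used again.
            del socket.socket.connect
```

Wrapping the tokenizer load in `with no_network():` should then surface any attempted connection as an `OSError` that names the target address.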

Cause of the Problem

Inside vibevoice_processor.py, the processor loads:
"language_model_pretrained_name": "Qwen/Qwen2.5-1.5B"

(or "Qwen/Qwen2.5-7B" for the 7B model)

as specified in each model’s preprocessor_config.json.
If this field holds a HuggingFace repo name instead of a local directory, I think the loader first attempts a local load (which fails due to a NoneType or mismatched path) and then falls back to:

kwargs['local_files_only'] = False  # enable online fetching

I think this forces a second attempt that ALWAYS hits HuggingFace, even when local files exist.
So offline mode doesn’t work unless the user manually updates the config.
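To check which case a given model falls into, a small script can read preprocessor_config.json and report whether `language_model_pretrained_name` resolves to a directory on disk. This is only a sketch: `describe_tokenizer_source` is a hypothetical helper, and treating every non-directory value as a Hub repo id is an assumption about how the loader interprets it.

```python
import json
import os


def describe_tokenizer_source(config_path):
    """Return (value, kind), where kind is 'local', 'hub-id', or 'missing'."""
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)

    value = config.get("language_model_pretrained_name")
    if value is None:
        return value, "missing"
    if os.path.isdir(value):
        return value, "local"
    # Assumption: anything that is not an existing directory is read as a repo id.
    return value, "hub-id"
```

A result of `"hub-id"` would be consistent with the behaviour described above, where the second load attempt goes to HuggingFace.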

✅ Workaround: edit the VibeVoice preprocessor_config.json file (Confirmed Working)

Each model’s directory contains a small, easy-to-edit file:
ComfyUI/models/vibevoice/VibeVoice-1.5B/preprocessor_config.json
ComfyUI/models/vibevoice/VibeVoice-7B/preprocessor_config.json

Change this line:
"language_model_pretrained_name": "Qwen/Qwen2.5-1.5B" (or "Qwen/Qwen2.5-7B")
to the local tokenizer folder, for example in my case:
"language_model_pretrained_name": "/home/<user>/ComfyUI/models/vibevoice/tokenizer"
Once set, both the 1.5B and 7B models load entirely offline without any network attempts.
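The same edit can be scripted for users who prefer not to touch the JSON by hand. The sketch below is an assumption-laden convenience, not something shipped with the node: `point_config_at_local_tokenizer` is a hypothetical name, and the `.bak` backup file is my own addition.

```python
import json
import os
import shutil


def point_config_at_local_tokenizer(config_path, tokenizer_dir):
    """Rewrite language_model_pretrained_name to a local directory, keeping a backup."""
    if not os.path.isdir(tokenizer_dir):
        raise FileNotFoundError(f"tokenizer directory not found: {tokenizer_dir}")

    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)

    # Keep the original file next to the edited one.
    shutil.copy2(config_path, config_path + ".bak")

    # Forward slashes keep the value readable and are accepted by Python on Windows too.
    local_path = os.path.abspath(tokenizer_dir).replace(os.sep, "/")
    config["language_model_pretrained_name"] = local_path

    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return local_path
```

It would be called once per model, for example with the 1.5B config path and the shared tokenizer folder as arguments.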

✅ Workaround: edit one of the node's files, vibevoice_processor.py (Confirmed Working)

/home/<user>/ComfyUI/custom_nodes/VibeVoice-ComfyUI/vvembed/processor/vibevoice_processor.py

My goal here was not a fix for every setup, just a patch that:

  • completely prevents online access
  • still loads the tokenizer from your local tokenizer folder
  • requires the fewest code changes
  • does NOT require rewriting the full logic

Inside vibevoice_processor.py, the online fallback is triggered by the except block below.

Change

except Exception as e:
    logger.warning(f"Standard from_pretrained failed: {e}")
    logger.info("Trying with allow remote files...")
    kwargs['local_files_only'] = False
    tokenizer = VibeVoiceTextTokenizerFast.from_pretrained(
        language_model_pretrained_name,
        **kwargs
    )

to

except Exception as e:
    logger.warning(f"Standard from_pretrained failed: {e}")
    logger.info("Falling back to local tokenizer directory instead of remote.")

    # Force loading from the known local tokenizer folder, assumed to sit next to
    # the model directory (requires `import os` at the top of the module):
    local_tokenizer_dir = os.path.join(
        os.path.dirname(pretrained_model_name_or_path),
        "tokenizer"
    )

    tokenizer = VibeVoiceTextTokenizerFast.from_pretrained(
        local_tokenizer_dir,
        local_files_only=True
    )
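One caveat with the patched block: `os.path.dirname` returns the directory itself when the path ends in a separator, so the fallback would point inside the model folder if `pretrained_model_name_or_path` carries a trailing slash. Whether the node ever passes such a path is not confirmed here. A hedged sketch of a more defensive version, with `sibling_tokenizer_dir` as a hypothetical helper name:

```python
import os


def sibling_tokenizer_dir(model_dir):
    """Return the 'tokenizer' folder next to model_dir, tolerating a trailing slash."""
    # os.path.dirname("a/b/") is "a/b", not "a", so normalise the path first.
    parent = os.path.dirname(os.path.normpath(model_dir))
    return os.path.join(parent, "tokenizer")
```

With this helper, a model directory given with or without a trailing separator maps to the same tokenizer folder.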

Suggestion

To avoid requiring every user to patch their configs manually, you could modify vibevoice_processor.py so that:

  1. If a local tokenizer directory exists, it is preferred automatically, regardless of the value in preprocessor_config.json.
  2. The online fallback is NOT forced unless explicitly requested.
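The two points above could be sketched roughly as follows. This is not the node's actual code: `resolve_tokenizer_source` and its `allow_remote` flag are hypothetical names, and the returned pair is meant to feed a `from_pretrained`-style call that accepts `local_files_only`.

```python
import os


def resolve_tokenizer_source(configured_name, local_tokenizer_dir, allow_remote=False):
    """Return (name_or_path, local_files_only) for a from_pretrained-style call."""
    # 1. A local tokenizer directory wins, whatever the config says.
    if local_tokenizer_dir and os.path.isdir(local_tokenizer_dir):
        return local_tokenizer_dir, True

    # The configured value may itself already be a local directory.
    if configured_name and os.path.isdir(configured_name):
        return configured_name, True

    # 2. Only go online when the caller explicitly opted in.
    if allow_remote and configured_name:
        return configured_name, False

    raise FileNotFoundError(
        "No local tokenizer found and remote loading was not requested "
        f"(configured name: {configured_name!r})"
    )
```

Failing loudly in the last case keeps an offline machine from silently waiting on network retries.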

Thanks again for the great work — best of the VibeVoice nodes I've tried. Would be great if this forced-online behaviour could be fixed.
