enhance: Add Hugging Face inference provider support by junjiejiangjjj · Pull Request #50818 · milvus-io/milvus

junjiejiangjjj · 2026-06-26T03:22:21Z

Add Hugging Face Inference Providers client support for feature extraction and sentence similarity APIs, and wire it into text embedding and rerank model providers.

The new provider supports:

- text embedding via feature-extraction
- rerank scoring via sentence-similarity
- Hugging Face router provider selection with hf_provider
- MILVUS_HUGGINGFACE_API_KEY credential fallback
- provider config entries for text embedding and rerank

Also add focused tests for the Hugging Face client, rerank provider, and paramtable provider docs.

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>

sre-ci-robot · 2026-06-26T03:22:39Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: junjiejiangjjj
To complete the pull request process, please assign yanliang567 after the PR has been reviewed.
You can assign the PR to them by writing /assign @yanliang567 in a comment when ready.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sre-ci-robot · 2026-06-26T03:24:25Z

[ci-v2-notice]
Notice: New ci-v2 system is enabled for this PR.

To rerun ci-v2 checks, comment with:

/ci-rerun-code-check // for ci-v2/code-check
/ci-rerun-code-check-macos // for Code Checker MacOS (GitHub Actions)
/ci-rerun-build // for ci-v2/build
/ci-rerun-build-all // for ci-v2/build-all (multi-arch builds)
/ci-rerun-buildenv // for ci-v2/build-env (build milvus-env builder images; update .env after the new tag is ready)
/ci-rerun-ut-integration // for ci-v2/ut-integration, will rerun ci-v2/build
/ci-rerun-ut-go // for ci-v2/ut-go, will rerun ci-v2/build
/ci-rerun-ut-cpp // for ci-v2/ut-cpp
/ci-rerun-ut // for all ci-v2/ut-integration, ci-v2/ut-go, ci-v2/ut-cpp, will rerun ci-v2/build
/ci-rerun-e2e-default // for ci-v2/e2e-default
/ci-rerun-e2e-amd // for ci-v2/e2e-amd (e2e pool dispatcher)
/ci-rerun-build-ut-cov // for ci-v2/build-ut-cov (build + unit tests in one pipeline)
/ci-rerun-gosdk // for ci-v2/go-sdk (Go SDK E2E tests, ARM)

If you have any questions or requests, please contact @zhikunyao.

liliu-z · 2026-06-26T04:55:20Z

+	return provider.fieldDim
+}
+
+func (provider *HuggingFaceEmbeddingProvider) CallEmbedding(_ context.Context, texts []string, _ models.TextEmbeddingMode) (any, error) {


CallEmbedding(_ context.Context, texts []string, _ models.TextEmbeddingMode) ignores the embedding mode and sends the single configured prompt_name on every request, so inserts and searches are encoded with the same prompt. Asymmetric retrieval models (E5/BGE/GTE/Qwen) need different prompts for queries and documents. The sibling TEI provider already switches between ingestion_prompt and search_prompt by mode, and Gemini uses RETRIEVAL_DOCUMENT vs RETRIEVAL_QUERY. Here, a config like prompt_name=query applies the query prompt to documents at insert time, silently degrading retrieval quality with no knob to fix it. Add mode-specific HF prompt params by mapping the existing ingestion/search prompt concepts into the request before calling FeatureExtraction.

Hugging Face Inference Providers expose a single feature-extraction pipeline API and do not define separate query/document embedding endpoints or mode-specific prompt fields.
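For reference, the mode-specific mapping suggested in this thread could be done entirely on the client side, which would leave the single feature-extraction request shape untouched. The sketch below is only an illustration under stated assumptions: embeddingMode stands in for models.TextEmbeddingMode, and promptConfig with its IngestionPrompt and SearchPrompt fields is hypothetical and not part of the PR.

```go
package main

import "fmt"

// embeddingMode is a stand-in for models.TextEmbeddingMode (assumption:
// the real type distinguishes insert-time from search-time calls).
type embeddingMode int

const (
	insertMode embeddingMode = iota
	searchMode
)

// promptConfig is hypothetical. PromptName mirrors the single configured
// prompt_name; the other two fields are optional per-mode overrides.
type promptConfig struct {
	PromptName      string
	IngestionPrompt string
	SearchPrompt    string
}

// promptFor picks the prompt name to send for the given mode, falling back
// to the single PromptName when no per-mode override is configured.
func (c promptConfig) promptFor(mode embeddingMode) string {
	switch mode {
	case insertMode:
		if c.IngestionPrompt != "" {
			return c.IngestionPrompt
		}
	case searchMode:
		if c.SearchPrompt != "" {
			return c.SearchPrompt
		}
	}
	return c.PromptName
}

func main() {
	cfg := promptConfig{IngestionPrompt: "document", SearchPrompt: "query"}
	fmt.Println(cfg.promptFor(insertMode)) // document
	fmt.Println(cfg.promptFor(searchMode)) // query
}
```

Because the selection only changes which value is placed in the existing prompt field, configurations that set just prompt_name would keep their current behavior.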

liliu-z · 2026-06-26T04:55:21Z

+	if err := json.Unmarshal(raw, &tokenLevel); err == nil && len(tokenLevel) > 0 {
+		return nil, merr.WrapErrFunctionFailedMsg("Hugging Face feature-extraction returned token-level embeddings; please use a sentence embedding model or configure pooling")
+	}
+	return nil, merr.WrapErrFunctionFailedMsg("unsupported Hugging Face feature-extraction response format")


A 200 response whose body is not a recognized embedding array (e.g. an HF error envelope like {"error":"..."}) falls through to a generic "unsupported Hugging Face feature-extraction response format" error that discards the body, so HF's actual message is lost and the failure is hard to diagnose. Because FeatureExtractionResponse is a json.RawMessage, PostRequest's unmarshal accepts the envelope and it reaches this code. The rerank path similarly collapses such a body into "unmarshal response failed". Surface the response body in the error (and, if it is a known transient envelope, treat it as retriable) instead of dropping it.

sre-ci-robot · 2026-06-26T05:01:17Z

✅ CI Loop Results `c6b76a2`

| Stage | Result | Duration | Tests |
|---|---|---|---|
| ✅ Build | SUCCESS | 9.4min | - |
| ✅ Code-Check | SUCCESS | 6.3min | - |
| ✅ UT-GO | SUCCESS | 20.0min | 1071 total, 1071 passed, 0 failed |
| ✅ UT-Integration | SUCCESS | 24.0min | 46 total, 46 passed, 0 failed |
| ✅ UT-CPP-Cov | SUCCESS | 41.9min | 8000 total, 8000 passed, 0 failed |

Total: 68min

Overall Coverage: 72.2%
Diff Coverage: Go 81.2% (208 hit, 48 miss, 256 measurable lines, 206 unmeasured)
Total Patch Coverage: 81.3% (208/256 measurable lines, 206 unmeasured)

sre-ci-robot requested review from XuanYang-cn and godchen0212 June 26, 2026 03:22

sre-ci-robot added the size/XL Denotes a PR that changes 500-999 lines. label Jun 26, 2026

mergify Bot added dco-passed DCO check passed. kind/enhancement Issues or changes related to enhancement labels Jun 26, 2026

liliu-z reviewed Jun 26, 2026

junjiejiangjjj wants to merge 1 commit into milvus-io:master from junjiejiangjjj:hf-infer
