Hi! This is a great library, thanks for open-sourcing it.
Is it possible to extract embeddings from this model that can then be clustered for speaker identification? E.g. could I take the output of the encoder here before the combined embedding is created?
```python
speech_embeds = self.audio_encoder(speech)
```
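To make the idea concrete, here's a rough sketch of what I have in mind: mean-pool the encoder output over time to get one vector per utterance, then cluster by cosine similarity. The shapes, the numpy stand-ins for the real `speech_embeds`, and the similarity threshold are all just guesses on my part, not anything from this repo:

```python
import numpy as np

# Stand-in for encoder outputs; in the real model each utterance would come
# from speech_embeds = self.audio_encoder(speech), shape assumed [T, D].
rng = np.random.default_rng(0)
spk_a = rng.normal(0.0, 0.1, size=(2, 50, 8)) + np.ones(8)  # two utterances, "speaker A"
spk_b = rng.normal(0.0, 0.1, size=(2, 50, 8)) - np.ones(8)  # two utterances, "speaker B"
utterances = np.concatenate([spk_a, spk_b])

# Mean-pool over the time axis: one fixed-size embedding per utterance.
embeds = utterances.mean(axis=1)

# L2-normalise so dot products are cosine similarities.
embeds /= np.linalg.norm(embeds, axis=1, keepdims=True)
sim = embeds @ embeds.T

# Greedy threshold clustering: same speaker if cosine similarity > 0.5.
labels = [-1] * len(embeds)
next_label = 0
for i in range(len(embeds)):
    if labels[i] == -1:
        labels[i] = next_label
        next_label += 1
    for j in range(i + 1, len(embeds)):
        if labels[j] == -1 and sim[i, j] > 0.5:
            labels[j] = labels[i]

print(labels)  # hoping utterances from the same speaker share a label
```

Is something along these lines sensible with this encoder, or is its output not speaker-discriminative at that point in the pipeline?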
I'm new to speech processing, so please forgive me if that's a daft question. Thanks!