Prediction results are the same when the model is loaded across multiple GPUs #184

@Factoryfromhere

Description of the bug:

I loaded the model (txgemma-27b-chat) across multiple GPUs with the accelerate library, then ran the "BindingDB_kd" task. After inputting a Drug SMILES and a Target amino acid sequence, I got a result (374). However, when I switched to a different Drug SMILES, I got the same result. Here is my code; is there something wrong?

Actual vs expected behavior:

No response

Any other information you'd like to share?

```python
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from accelerate import Accelerator

model_name = "/data/models/AIDD/google/txgemma-27b-chat/"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    low_cpu_mem_usage=True,
)
accelerator = Accelerator()
model = accelerator.prepare(model)

tdc_prompts_filepath = "/data/models/AIDD/google/txgemma-27b-chat/tdc_prompts.json"
with open(tdc_prompts_filepath, "r") as f:
    tdc_prompts_json = json.load(f)

task_name = "BindingDB_kd"
input1_type = "{Drug SMILES}"
drug_smiles = "CCN(CC)CCNC(=O)c1c(C)[nH]c(\C=C2/C(=O)Nc3ccc(F)cc23)c1C"  # 15 nM
input2_type = "{Target amino acid sequence}"
amino_acid_sequence = "MELRVGNRYRLGRKIGSGSFGDIYLGTDIAAGEEVAIKLECVKTKHPQLHIESKIYKMMQGGVGIPTIRWCGAEGDYNVMVMELLGPSLEDLFNFCSRKFSLKTVLLLADQMISRIEYIHSKNFIHRDVKPDNFLMGLGKKGNLVYIIDFGLAKKYRDARTHQHIPYRENKNLTGTARYASINTHLGIEQSRRDDLESLGYVLMYFNLGSLPWQGLKAATKRQKYERISEKKMSTPIEVLCKGYPSEFATYLNFCRSLRFDDKPDYSYLRQLFRNLFHRQGFSYDYVFDWNMLKFGASRAADDAERERRDREERLRHSRNPATRGLPSTASGRLRGTQEVAPPTPLTPTSHTANTSPRPVSGMERERKVSMRLHRGAPVNISSSDLTGRQDTSRMSTSQIPGRVASSGLQSVVHR"
TDC_PROMPT = tdc_prompts_json[task_name].replace(input1_type, drug_smiles).replace(input2_type, amino_acid_sequence)

inputs = tokenizer(TDC_PROMPT, return_tensors="pt")
inputs = {k: v.to(accelerator.device) for k, v in inputs.items()}

model.eval()
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=8)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

if accelerator.is_main_process:
    print(generated_text)
```
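One way to narrow this down is to rule out the prompt templating: build the prompt for two different SMILES strings and confirm the resulting strings actually differ before they reach the model. A minimal sketch (the template below is a stand-in, since the contents of `tdc_prompts.json` aren't shown here; the real template comes from that file):

```python
# Stand-in template with the same placeholder names used by the script above.
template = (
    "Drug SMILES: {Drug SMILES}\n"
    "Target: {Target amino acid sequence}\n"
    "Predict the binding affinity (Kd)."
)

def build_prompt(template, drug_smiles, target_seq):
    # Same two-step .replace() substitution as in the original script.
    return (template
            .replace("{Drug SMILES}", drug_smiles)
            .replace("{Target amino acid sequence}", target_seq))

target = "MELRVGNRYRLG"  # truncated for brevity
prompt_a = build_prompt(template, "CCO", target)
prompt_b = build_prompt(template, "c1ccccc1", target)

print(prompt_a != prompt_b)  # prints True: the prompts differ
```

If the prompts differ but the generated text is still identical, the templating is not the culprit and the model/loading side (e.g. the interaction of `device_map="auto"` with `accelerator.prepare`) is worth checking next.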
