Direct inference on pandas dataframe #98

Open
rbhatia46 opened this issue May 9, 2022 · 2 comments

Comments

@rbhatia46

Hi,
I see that every time I want to run inference, I have to save a separate CSV and then load it by passing its path to dm.data.process_unlabeled.

Is there a way to pass a pandas DataFrame directly to this function and run inference without creating a new CSV?

@rbhatia46

@sidharthms could you please assist with this?

@etiennekintzler

etiennekintzler commented Jul 7, 2022

Hey @rbhatia46

It's possible to handle a pandas.DataFrame by modifying MatchingDataset.__init__ and process_unlabeled (I've tried this on a fork of the project). @sidharthms, should I open a PR, or is that out of scope?

To make it work without changing the source code you could also use a temporary file:

import os
import tempfile

import pandas as pd
import deepmatcher as dm

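def run_prediction(df, model, **kwargs):
    # mkstemp returns an open OS-level file descriptor and the file's path.
    fd, path = tempfile.mkstemp()
    try:
        # Write the DataFrame as CSV; the descriptor is closed when the block exits.
        with os.fdopen(fd, 'w', encoding='utf-8', newline='') as tmp:
            df.to_csv(tmp, index=False)
        unlabeled = dm.data.process_unlabeled(path=path, trained_model=model)
        predictions = model.run_prediction(unlabeled, **kwargs)
    finally:
        # Always delete the temporary file, even if processing fails.
        os.remove(path)
    # Return outside `finally` so that exceptions are not silently swallowed.
    return predictions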

Then

model = dm.MatchingModel()
model.load_state('path/to/model.pth')
df = pd.DataFrame({
    "id": [0], "left_name": ["surname"], "right_name": ["name surname"]
})

run_prediction(df, model, output_attributes=True)
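
The same temporary-file idea can also be packaged as a small context manager, which keeps the cleanup logic separate from the prediction call. This is only a sketch: the name dataframe_as_csv and the default file name unlabeled.csv are made up for illustration and are not part of deepmatcher. It relies only on the object having a pandas-style to_csv(path, index=False) method.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def dataframe_as_csv(df, filename="unlabeled.csv"):
    """Write `df` to a CSV inside a throwaway directory and yield its path.

    Hypothetical helper (not part of deepmatcher). `df` only needs a
    pandas-style `to_csv(path, index=False)` method.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, filename)
        df.to_csv(path, index=False)
        # The file exists only while the caller's `with` block is running.
        yield path
    # On exit, TemporaryDirectory removes the directory and the CSV in it.


# Intended usage with deepmatcher (not executed here; assumes `dm`, `model`
# and `df` are set up as in the example above):
#
#     with dataframe_as_csv(df) as path:
#         unlabeled = dm.data.process_unlabeled(path=path, trained_model=model)
#         predictions = model.run_prediction(unlabeled, output_attributes=True)
```

Keeping both process_unlabeled and run_prediction inside the with block is the cautious choice, since the file is gone as soon as the block exits.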
