Vantage6 algorithm for patient similarity for the HealthAI PoC

This algorithm was designed for the vantage6 architecture.

Input data

The algorithm expects each data node to hold a csv with the following data and adhering to the following standard:

{
  "id": {
    "description": "patient identifier",
    "type": "string"
  },
  "t": {
    "description": "patient t stage",
    "type": ["integer", "categorical"],
    "values": [
      [-1, 0, 1, 2, 3, 4], 
      ["Tx", "T1a", "T1b", "T1c", "T2a", "T2b", "T3", "T4"]
    ]
  },
  "n": {
    "description": "patient n stage",
    "type": ["integer", "categorical"],
    "values": [[-1, 0, 1, 2, 3], ["Nx", "N0", "N1", "N2", "N3"]]
  },
  "m": {
    "description": "patient m stage",
    "type": ["integer", "categorical"],
    "values": [[-1, 0, 1], ["Mx", "M0", "M1a", "M1b", "M1c"]]
  },
  "date_of_diagnosis": {
      "description": "date the patient was diagnosed",
      "type": "string",
      "format": "%Y-%m-%d"
  },
  "date_of_fu": {
      "description": "date the patient had the last follow up visit",
      "type": "string",
      "format": "%Y-%m-%d"
  },
  "vital_status": {
      "description": "patient vital status",
      "type": "categorical",
      "values": ["alive", "dead"]
  },
}

Using the algorithm

Below you can see an example of how to run the algorithm:

import time
from vantage6.client import Client

# Initialise the client
client = Client('http://127.0.0.1', 5000, '/api')
client.authenticate('username', 'password')
client.setup_encryption(None)

# Define algorithm input
input_ = {
    'method': 'master',
    'master': True,
    'kwargs': {
        'org_ids': [2, 3],          # organisations to run kmeans
        'k': 4,                     # number of clusters to compute
        'epsilon': 0.01,            # threshold for convergence criterion
        'max_iter': 50,             # maximum number of iterations to perform
        'columns': ['t', 'n', 'm']  # columns to be used for clustering
    }
}

# Send the task to the central server
task = client.task.create(
    collaboration=1,
    organizations=[2, 3],
    name='v6-healthai-patient-similarity-py',
    image='ghcr.io/maastrichtu-cds/v6-healthai-patient-similarity-py:latest',
    description='run tnm patient similarity',
    input=input_,
    data_format='json'
)

# Retrieve the results
task_info = client.task.get(task['id'], include_results=True)
while not task_info.get('complete'):
    task_info = client.task.get(task['id'], include_results=True)
    time.sleep(1)
result_info = client.result.list(task=task_info['id'])
results = result_info['data'][0]['result']

Testing locally

If you wish to test the algorithm locally, you can create a Python virtual environment, using your favourite method, and do the following:

source .venv/bin/activate
pip install -e .
python v6_kmeans_py/example.py

The algorithm was developed and tested with Python 3.7.

Acknowledgments

This project was financially supported by the AiNed foundation.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
.github/workflows		.github/workflows
v6_healthai_patient_similarity_py		v6_healthai_patient_similarity_py
.dockerignore		.dockerignore
.flake8		.flake8
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Vantage6 algorithm for patient similarity for the HealthAI PoC

Input data

Using the algorithm

Testing locally

Acknowledgments

About

Releases

Packages

Languages

License

MaastrichtU-CDS/v6-healthai-patient-similarity-py

Folders and files

Latest commit

History

Repository files navigation

Vantage6 algorithm for patient similarity for the HealthAI PoC

Input data

Using the algorithm

Testing locally

Acknowledgments

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages